Abstract

Visual  recognition,  navigation,  tracking  and  imagery  are  posited  to  involve 
some  of  the  same  types  of  representations  and  processes.  The  first  part  of 
this  paper  develops  a  theory  of  some  of  the  shared  types  of  representations  and 
processing  modules.  This  theory  is  developed  in  light  of  computational, 
neuroanatomical, neurophysiological, and behavioral considerations. The second
part  of  the  paper  develops  a  mechanism  for  the  development  of  lateralization  of 
visual  function  in  the  brain.  This  theory  leads  to  predictions  about  the 
lateralization  of  the  putative  processing  modules.  The  third  part  of  the  paper 
examines  critical  tests  of  these  predictions,  and  reviews  relevant  empirical 
findings  in  the  literature. 
Visual Hemispheric Specialization:
A  Computational  Theory 


There  has  been  great  interest  in  how  the  two  cerebral  hemispheres  are 
specialized  for  visual  processing  since  the  time  of  John  Hughlings  Jackson,  who 
in  1874  reported  an  apparent  right-hemisphere  specialization  for  recognition. 
Theories  of  visual  specialization,  following  the  fashion  in  the  field  of 
neuropsychology,  have  tended  to  focus  on  various  dichotomies;  for  example,  the 
right  hemisphere  has  been  said  to  be  specialized  for  information  about  global 
shape  and  the  left  specialized  for  information  about  details  (see  Springer  and 
Deutsch,  1981).  This  strategy,  of  trying  to  discover  a  dimension  that  will 
capture  the  differences  in  processing,  has  much  to  recommend  it.  Indeed,  if 
there  are  general  principles  that  distinguish  types  of  processing  systems,  then 
those  systems  should  be  able  to  be  characterized  in  terms  of  sets  of  such 
dimensions.  However,  the  dimensions  that  have  been  explored  to  date  have  not 
been  closely  related  to  theories  of  processing  systems,  and  have  generally  not 
been  well  motivated.  Rather,  the  dimensions  chosen  typically  are  selected  on 
the  basis  of  intuition  and  apparent  descriptive  power. 

In  this  paper  I  present  an  alternative  way  of  attempting  to  understand 
visual  hemispheric  specialization.  This  approach  is  based  on  the  idea  of 
"natural  computation"  (see  Marr,  1982),  in  which  we  try  to  understand  the  brain 
in  terms  of  components  that  interpret  and  transform  data  in  various  ways.  The 
theory  we  develop  here  focuses  on  "high  level"  visual  processes,  which  can  be 
characterized  as  those  processes  that  can  be  directly  altered  by  one's 
knowledge and beliefs. (Thus, although one may believe that a bright light
will  occur  and  so  close  one's  eyelids,  retinal  processing  is  not  considered 
"high  level"  because  it  is  not  directly  altered  by  the  belief.)  We  focus  on 
this class of processing primarily because there is no evidence that low-level,
sensory visual processing is lateralized (e.g., see Berkley, Kitterle, and
Watkins, 1975; Di Lollo, 1981; Rijsdijk, Kroon, and Van der Wildt, 1980).

On  this  characterization,  then,  high  level  visual  processing  is 
involved  in  visual  recognition,  navigation,  tracking,  and  mental  imagery.  We 
are particularly interested in how experience can play a role in the
organization  of  such  visual  functions  in  the  two  cerebral  hemispheres.  In 
developing  this  theory  we  will  make  use  of  neurophysiological  and 
neuroanatomical  data  from  non-human  primates,  computational  constraints,  and 
behavioral  data  from  human  subjects.  We  begin  by  considering  what  vision  and 
imagery  are  for,  and  derive  computational  constraints  from  this  analysis. 

I.  VISUAL  PERCEPTION  AND  VISUAL  IMAGERY 

Before  beginning  to  formulate  a  theory  of  how  a  function  might  be 
carried  out  by  the  brain,  it  is  useful  to  begin  by  considering  the  purpose  (or 
purposes) of that function. Vision has two primary purposes: First, we try to
recognize  objects  and  parts  thereof.  This  function  allows  us  to  apply 
previously  gained  knowledge  to  newly  encountered  objects.  For  example,  once 
one  has  recognized  something  as  an  apple,  one  knows  that  it  is  edible,  has 
seeds  inside,  and  so  on.  In  order  to  carry  out  this  function,  visual  input 
must  be  encoded  in  such  a  way  that  it  makes  contact  with  the  appropriate 
previously  stored  information  (see  Marr,  1982).  Second,  we  use  vision  to 
navigate through space (and not bump into objects or walk into holes) and to
track moving objects (avoiding or intercepting them, as is appropriate). In
these  cases,  the  goal  is  not  to  encode  information  in  order  to  access  relevant 
memory  representations.  Rather,  the  goal  is  to  compute  metric  spatial 
relationships  and  to  update  them  as  objects  move  relative  to  one  another. 

It  is  interesting  that  the  purposes  of  imagery  parallel  those  of 
vision.  Perhaps  this  is  not  surprising,  given  that  virtually  all  definitions 
or characterizations of imagery hinge on its similarity to like-modality
perception. For example, visual imagery is usually characterized as "the
experience of seeing in the absence of the appropriate sensory input" or the
like. Indeed, having an image produces the conscious experience of "seeing",
but with the "mind's eye" rather than with real ones. [FOOTNOTE 1] One purpose
of imagery uses recognition to make explicit information stored implicitly in
memory.  That  is,  we  encode  patterns  without  classifying  them  in  all  possible 
ways;  indeed,  there  may  be  an  infinite  number  of  ways  to  classify  a  shape 
(e.g.,  relative  lengths  along  all  possible  pairs  of  axes).  In  order  to  make 
explicit  a  particular  aspect  of  a  remembered  pattern,  we  may  form  an  image  and 
"internally recognize" that aspect of it. That is, we "recognize" parts and
properties  of  imaged  objects  we  had  not  previously  considered.  For  example, 
consider  how  you  answer  the  following  questions:  What  shape  are  a  beagle's 
ears?  Which  is  darker  green,  a  Christmas  tree  or  a  frozen  pea?  Which  is 
bigger,  a  tennis  ball  or  an  orange?  Most  people  claim  that  they  visualize  the 
objects  and  "look"  at  them  in  order  to  answer  these  questions  (and  the 
behavioral  data  support  this  claim;  see  Kosslyn,  1980).  Imagery  is  most  often 
used  in  memory  retrieval  when  the  to-be-remembered  information  is  a  subtle 
visual  property  that  has  not  been  explicitly  considered  previously  and  cannot 
be easily deduced from other facts (e.g., information about the category in
general; see Kosslyn and Jolicoeur, 1980).

A  second  purpose  of  visual  imagery  parallels  the  perceptual  mapping  and 
tracking  functions  of  perception.  Imagery  is  a  way  of  anticipating  what  would 
happen  if  we  were  to  move  in  a  particular  way  or  if  something  else  is  moving 
relative  to  us.  That  is,  we  use  imagery  to  perform  "mental  simulations," 
looking  to  "see"  what  would  happen  in  the  analogous  physical  situation.  For 
example,  we  might  imagine  a  jar  and  "see"  if  there  is  room  for  it  at  a  given 
spot  on  the  refrigerator  shelf,  or  we  might  mentally  project  an  object's 
trajectory,  "seeing"  where  it  will  hit.  Imagery  is  used  here  when  one  reasons 
about  visual  appearances  of  objects  under  transformation,  especially  when 
subtle  visual  relations  are  involved. 

Finally, we can use our imagery abilities in the service of more
abstract  thinking  and  learning.  Shepard  and  Cooper  (1982)  review  numerous 
cases  of  scientific  problem-solving  in  which  "imaged  models"  were  used  as  aids 
to  reasoning.  Einstein,  for  example,  claimed  that  his  first  insight  into 
relativity  theory  arose  when  he  considered  what  he  "saw"  when  he  imaged  chasing 
after  and  matching  the  speed  of  a  beam  of  light.  However,  these  kinds  of  uses 
of  imagery  seem  to  rely  on  the  first  two  uses  of  imagery:  In  visual  thinking 
and  learning,  we  use  imagery  as  a  way  of  retrieving  tacit  knowledge  from  memory 
or  as  a  way  of  performing  mental  simulations. 

Given  the  apparent  parallels  between  the  purposes  of  imagery  and 
vision,  it  is  not  surprising  that  much  empirical  research  has  demonstrated  that 
imagery  and  like-modality  perception  utilize  some  common  processing  mechanisms 
(for  reviews  see  Finke,  1980;  Finke  and  Shepard,  in  press;  Kosslyn,  1980,  1983; 
Shepard and Cooper, 1982). For example, if one is holding in mind a visual
image  (e.g.,  of  a  flower),  this  will  impair  visual  perception  more  than  it 
impairs  auditory  perception,  but  vice  versa  if  one  is  holding  in  mind  an 
auditory  image  (e.g.,  the  sound  of  a  telephone  ringing;  see  Segal,  1971). 
Perhaps  most  interesting,  manipulating  objects  in  images  reveals  time-courses 
like  those  observed  in  the  real  world.  For  example,  Figure  1  illustrates  pairs 
of  stimuli  used  in  a  classic  study  by  Shepard  and  Metzler  (1971).  They  asked 
subjects  to  decide  if  the  objects  were  the  same  or  different  shape, 
irrespective  of  their  orientation.  Figure  2  presents  the  results,  indicating  a 
highly linear increase in decision time as more mental rotation was required to
bring  the  forms  into  congruence.  This  result  is  impressive  because  images  are 
not  actual,  rigid  objects,  and  hence  are  not  constrained  by  physics  to  have  to 
pass  through  intermediate  positions  when  the  orientation  of  an  imaged  object  is 
changed.  Similar  results  are  obtained  with  image  scanning.  Kosslyn,  Ball  and 
Reiser  (1978)  asked  subjects  to  close  their  eyes  and  imagine  the  map 
illustrated  in  Figure  3.  This  map  had  seven  locations,  which  were  positioned 
so  that  there  were  21  distinct  inter-location  distances  between  all  possible 
pairs.  The  subjects  began  by  "focusing"  on  a  given  location  on  the  imagined 
map  (e.g.,  the  tree),  and  then  decided  whether  a  second  named  location  was  or 
was  not  on  the  map  (e.g.,  the  hut  versus  a  bench);  they  were  asked  to  respond 
in  the  affirmative  only  after  they  had  the  second  object  clearly  in  focus.  As 
is  evident  in  Figure  4,  increasingly  more  time  was  required  to  scan  between 
pairs  of  locations  that  were  increasingly  farther  apart  on  the  map,  indicating 
that  an  imaged  map  can  "stand  in"  for  the  actual  one.  Kosslyn  (1975)  reports 
another  finding  that  is  especially  suggestive:  If  an  object  is  imagined  at  a 
small size, more time is required to "see" its parts than if the object is
imagined  at  a  larger  size  (see  Kosslyn,  1980,  1983).  This  result  is  intriguing 
because  it  suggests  that  objects  in  images  are  subject  to  spatial  summation,  a 
well-known  property  of  neural  mechanisms  used  in  vision. 


INSERT  FIGURES  1,  2,  3,  4  ABOUT  HERE 


Three  Problems  in  Vision 

Although the behavioral phenomena reveal that imagery and like-modality
perception  share  underlying  mechanisms,  they  do  not  provide  much  illumination 
on  the  nature  of  those  mechanisms.  However,  such  data  become  very  useful  when 
we  consider  them  in  combination  with  neuropsychological  findings  and 
computational  theorizing.  Indeed,  behavioral  and  neuropsychological  data  are 
especially  useful  in  guiding  one  to  formulate  what  Marr  (1982)  called  a  "theory 
of  the  computation."  That  is,  a  computation  can  be  regarded  as  a  "black  box" 
that  transforms  input  in  a  systematic,  informationally-interpretable  way.  A 
theory  of  a  computation  specifies  what  must  be  computed  and  why.  Such  a  theory 
justifies  positing  a  given  computation  by  an  analysis  of  what  problems  must  be 
solved  and  the  requirements  on  the  solution  to  those  problems.  The  goal  of  a 
computation  is  specified,  as  well  as  the  nature  of  the  input  and  constraints  on 
the  solution.  This  sort  of  theory  is  to  be  distinguished  from  a  "theory  of  the 
algorithm,"  which  specifies  the  specific  steps  actually  used  to  carry  out  a 
computation.  The  theory  of  the  computation  is  a  fundamental  step,  outlining 
the  basic  "processing  components"  that  should  be  included  in  the  theory;  the 
theory of the algorithm fleshes out the details of how the computations are
performed.  In  this  paper  we  will  concentrate  on  the  first  level,  focusing  on  a 
theory  of  the  computations  used  in  one  aspect  of  imagery. 

We  can  use  the  inference  that  imagery  shares  mechanisms  with  perception 
to  discover  a  remarkable  amount  about  the  structure  of  the  information 
processing  system  underlying  high-level  vision.  We  do  so  by  first  considering 
some  fundamental  problems  that  must  be  solved  by  a  visual  system.  Our  brains 
have  apparently  solved  these  problems  in  specific  ways,  and  the  outlines  of 
these  solutions  are  now  apparent  in  the  literature  on  the  neurophysiology  and 
neuroanatomy  of  visual  perception;  these  solutions  have  direct  implications  for 
a  theory  of  visual  processing. 

Thus,  in  this  section  we  will  begin  to  develop  theories  of  some  of  the 
high-level  computations  that  are  performed  by  the  visual  system.  We  will  do  so 
by  considering  three  problems  which  must  be  solved  by  any  visual  system  and  the 
apparent  solutions  to  these  problems  adopted  by  primate  brains.  In  the 
following  section  we  will  explore  the  implications  of  these  inferences  for  a 
theory of imagery, assuming that visual imagery makes use of visual processing
mechanisms. 

1. The Problem of Position Variability.

The  same  object  is  likely  to  occur  at  various  positions  in  the  visual 
field.  Nevertheless,  once  we  have  seen  an  object,  we  can  recognize  it  just  as 
easily  when  it  subsequently  is  in  a  different  position  in  the  field. 

Logically,  there  are  two  ways  we  could  perform  this  feat:  On  the  one  hand, 
when  an  object  is  encoded  initially,  the  visual  system  could  associate  a 
separate  representation  with  each  of  the  possible  positions  of  the  object. 
This is one interpretation of the mechanism suggested by McClelland and
Rumelhart (1981) in their theory of word perception. They associate a
representation of each letter with each position in the field. (This was done
so that the same letter could be detected in more than one position in a word.)
On the other hand, when an object is encoded initially, it could be stored
using representations that are associated with a set of positions in the field.
In  the  limit,  only  one  representation  would  be  used  for  all  positions.  This  is 
the  solution  Marr  (1982)  offered  for  the  position  variability  problem;  Marr 
suggested  that  the  appearance  of  objects  is  stored  in  "object-centered" 
representations.  In  such  representations,  the  locations  of  parts  of  objects 
are  specified  relative  to  other  parts,  not  to  positions  in  space. 

The  solution  adopted  by  primate  visual  systems  to  the  problem  of 
position  variability  is  now  evident  in  the  neurophysiological  literature:  It 
has  been  found  in  primates  that  visual  cells  in  area  TE  (near  the  anterior  end 
of  the  inferior  temporal  lobe)  have  very  large  receptive  fields,  and  respond 
when  patterns  are  present  over  a  wide  range  of  positions  (the  receptive  field 
sizes are usually larger than 20 x 20 degrees of visual angle). This area of
the  brain  has  been  shown  to  be  critically  involved  in  recognition  per  se  (see 
Mishkin,  1982).  Thus,  the  primate's  solution  to  the  position  variability 
problem  relies  on  not  representing  the  position  of  a  pattern  in  the  high-level 
shape  representation  system.  (Incidentally,  this  is  a  good  example  of  how 
facts  about  the  neurological  underpinnings  of  behavior  can  have  direct  bearing 
on  theories  of  cognition;  this  finding  is  a  significant  challenge  to  the 
McClelland  and  Rumelhart  model.) 

One  implication  of  this  solution  is  that  only  one  shape  can  be 
recognized at a time (although we could rapidly switch back and forth between
stimuli, only one would be processed at any given instant); if multiple stimuli
were being processed simultaneously, the large receptive fields would often
result in the system's not being able to tell if there is one stimulus or two
of the same stimuli being presented in different locations (e.g., the letter A
in a word). Hence, figure/ground segregation is necessary to isolate
individual patterns before they can be processed further. If this is done,
then duplicate patterns can be isolated and processed separately, preventing
confusions about how many of a pattern are present in the field.
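
The contrast between the two coding schemes, and the counting confusion that
follows from the second, can be made concrete with a minimal sketch. It is an
illustration only, not a model of area TE: the function names
(position_specific, position_invariant, isolate_figures) and the four-slot
"field" are assumptions introduced here, and a letter stands in for any
pattern.

```python
# Hypothetical sketch: a position-bound code versus a position-free code.
# The "field" is a list of slots; each slot holds a letter or None.

def position_specific(field):
    """One unit per (letter, position) pair: a separate representation
    of each pattern at each position in the field."""
    return {(letter, pos) for pos, letter in enumerate(field) if letter is not None}


def position_invariant(field):
    """One unit per letter, pooled over every position, in the manner of
    a detector with a very large receptive field."""
    return {letter for letter in field if letter is not None}


def isolate_figures(field):
    """Stand-in for figure/ground segregation: pass one isolated pattern
    at a time to the position-free code, keeping its location separately."""
    return [(pos, position_invariant([letter]))
            for pos, letter in enumerate(field) if letter is not None]


one_a = ["A", None, None, None]
shifted_a = [None, None, None, "A"]
two_as = ["A", None, "A", None]

# The position-free code recognizes the pattern wherever it falls...
assert position_invariant(one_a) == position_invariant(shifted_a)
# ...but cannot tell one instance from two.
assert position_invariant(one_a) == position_invariant(two_as)
# The position-bound code keeps them apart, at the cost of one unit per position.
assert position_specific(one_a) != position_specific(two_as)
# Isolating each figure first restores the count.
assert len(isolate_figures(two_as)) == 2
```

In this toy version the location returned by isolate_figures is carried
alongside the shape code rather than inside it, which anticipates the
separation of shape and location discussed next.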

However, we do know where an object is when we see it. Thus, there
must  be  a  separate  representation  of  an  object's  location,  which  implies  two 
separate  mechanisms — one  to  represent  a  shape  independently  of  its  position  and 
one to represent its position. And in fact, Ungerleider and Mishkin (1982a)
summarize  evidence  for  "two  cortical  visual  systems."  Their  claim  is  that  the 
ventral  system,  running  from  area  OC  (primary  visual  cortex)  through  TEO  down 
to TE, is concerned with analyzing what an object is, whereas the dorsal
system,  running  almost  directly  from  circumstriate  area  OB  to  OA  and  then  to  PG 
(in  the  parietal  lobe)  is  concerned  with  analyzing  where  an  object  is.  Figure 
5  illustrates  the  relevant  areas  of  the  primate  brain. 


INSERT  FIGURE  5  ABOUT  HERE 


Two  sorts  of  data  are  relevant  to  Ungerleider  and  Mishkin's  claim. 
First,  the  neuroanatomy  and  neurophysiology  support  this  distinction.  There 
are  well-known  neural  connections  running  along  both  pathways,  and  the  visual 
properties of these areas have been well documented (e.g., see Ungerleider and
Mishkin,  1982b).  The  visual  areas  of  the  parietal  lobe  appear  to  have 
different  properties  from  those  of  the  ventral  visual  system  (e.g.,  they 
include  the  fovea  in  their  receptive  fields  less  often).  Second,  behavioral 
evidence  suggests  that  animals  are  severely  impaired  in  their  ability  to  learn 
to  discriminate  between  patterns  if  the  inferior  temporal  lobes  are  removed. 
However, this lesion does not disrupt their ability to learn locations. On the
other  hand,  if  the  parietal  lobes  are  removed,  animals  are  severely  impaired  in 
their  ability  to  discriminate  on  the  basis  of  location,  although  they  retain 
the ability to discriminate between patterns (see Ungerleider and Mishkin,
1982a, b).

This  interesting  design  of  the  processing  mechanisms  leads  to 
difficulties  that  must  be  overcome  by  the  system,  as  is  evident  when  we 
consider  another  problem  of  visual  perception. 

2.  The  Problem  of  Figure/Ground  Segregation. 

Before we can recognize an object, "figure" must be segregated from
"ground";  one  must  somehow  pick  out  regions  that  are  likely  to  correspond  to 
distinct  objects.  The  magnitude  of  difficulty  of  this  problem  becomes  evident 
if  you  look  at  a  digitized  representation  of  a  picture,  with  numbers 
representing  the  intensity  of  light  at  each  point;  the  objects  are  overwhelmed 
by  differences  in  lighting,  texture,  and  so  on,  and  it  is  very  difficult  to 
pick  them  out.  A  figure  must  be  selected  on  the  basis  of  physical  properties 
of  the  input,  such  as  regions  of  homogeneous  color  or  texture,  or  contiguous 
zero-crossings  in  the  second  derivative  of  the  function  relating  intensity  to 
position (which occur at the edges of objects; see Marr, 1982). That is,
because one has not yet identified the object (segregating its form from the
background  is  a  logical  prerequisite  to  recognition),  one  can  only  use  physical 
parameters  to  parse  figure  from  ground.  There  are  numerous  proposals  in  the 
computer  vision  literature  for  ways  of  organizing  input  into  regions  likely  to 
correspond  to  figures  (e.g.,  see  Ballard  and  Brown,  1982). 
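
As an illustration of one such physical cue, the following minimal sketch
locates zero-crossings in the second derivative of a one-dimensional intensity
profile. It is a simplification under stated assumptions: the profile values
are invented for the example, the function names are hypothetical, and the
smoothing step that Marr's account applies before differentiation is omitted.

```python
# Hypothetical sketch: edges as zero-crossings of the second derivative
# of intensity with respect to position (one dimension, no smoothing).

def second_difference(intensity):
    """Discrete second derivative at each interior sample:
    I[i-1] - 2*I[i] + I[i+1]."""
    return [intensity[i - 1] - 2 * intensity[i] + intensity[i + 1]
            for i in range(1, len(intensity) - 1)]


def zero_crossings(intensity):
    """Return each index i such that the second derivative changes sign
    between sample i and sample i + 1 of the original profile."""
    d2 = second_difference(intensity)
    crossings = []
    for k in range(len(d2) - 1):
        if d2[k] * d2[k + 1] < 0:
            # d2[k] was computed at sample k + 1 of the profile.
            crossings.append(k + 1)
    return crossings


# Invented profile: dark background, a bright region with blurred borders.
profile = [0, 0, 0, 2, 8, 10, 10, 10, 8, 2, 0, 0, 0]

# The sign changes fall where intensity rises and falls most steeply,
# that is, at the two borders of the bright region.
assert zero_crossings(profile) == [3, 8]
```

Note that the procedure uses only the intensity values themselves; nothing in
it depends on knowing what the bright region is.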

An  interesting  problem  arises  here  because  the  visual  system  processes 
input at different spatial frequency bandwidths (see Shapley and Lennie, 1983).
Higher  spatial  frequencies  correspond  to  more  light/dark  alternations  per 
degree  of  visual  angle;  thus,  higher  resolution  is  required  to  detect  higher 
spatial  frequencies.  The  system  can  be  described  as  having  a  number  of 
different  "channels,"  each  differing  in  resolution.  At  average  viewing 
distances,  the  lowest  spatial  frequency  channel  produces  an  output  that  will 
often  correspond  to  the  general  shape  envelope  of  an  object.  But  consider  what 
will  happen  at  higher  spatial  frequency  channels:  the  same  factors  that  result 
in  the  parse  of  the  object  from  the  background  will  result  in  parts  of  a  single 
object  (e.g.,  the  arms,  legs,  and  head  of  a  person)  being  parsed  from  one 
another. That is, the system cannot "know" what is an object and what is a
part  of  an  object;  it  just  organizes  regions  in  the  input  on  the  basis  of 
physical  parameters  of  the  input  array.  And  herein  lies  a  difficulty:  Once 
parsed,  the  shape  representation  system  "ignores"  the  location  in  the  visual 
field  of  the  stimulus.  Thus,  the  representation  of  the  shapes  of  the  parts 
will  not  preserve  their  positions.  But  the  arrangement  of  parts  is  important 
for  many  recognition  tasks.  The  relations  must  be  represented  somehow. 
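
The effect of resolution on parsing can be illustrated with a minimal sketch.
Everything in it is an assumption made for the example: the intensity values
are invented, a box blur stands in for a low spatial frequency channel, and a
fixed brightness threshold stands in for whatever physical criterion the
system actually uses to group regions.

```python
# Hypothetical sketch: the same input parsed at two resolutions.

def box_blur(profile, width):
    """Crude low-pass filter: mean over a centered window of the given
    width, treating samples beyond the ends as zero."""
    half = width // 2
    n = len(profile)
    blurred = []
    for i in range(n):
        window = [profile[j] for j in range(i - half, i + half + 1) if 0 <= j < n]
        blurred.append(sum(window) / width)
    return blurred


def count_regions(profile, threshold):
    """Count contiguous runs of samples brighter than the threshold."""
    regions = 0
    inside = False
    for value in profile:
        if value > threshold and not inside:
            regions += 1
        inside = value > threshold
    return regions


# Invented profile: one object made of three bright parts with narrow gaps.
profile = [0, 0, 10, 10, 10, 2, 10, 10, 10, 2, 10, 10, 10, 0, 0]

# At full resolution the gaps split the object into its parts.
assert count_regions(profile, 5) == 3
# After blurring, only the overall envelope of the object survives.
assert count_regions(box_blur(profile, 5), 5) == 1
```

In neither case does the procedure "know" whether a region is an object or a
part; the number of regions depends only on the resolution of the channel.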

The  most  straightforward  solution  to  this  problem  requires  a  minor 
revision to the Ungerleider and Mishkin theory. It seems clear that "what" and
"where" are not so distinct conceptually: Sometimes the spatial relations
among  the  parts  are  critical  for  identifying  the  form;  for  example,  the 
difference between a Q with a long tail and an O with a diagonal slash through
it  is  a  matter  of  where  the  diagonal  is  positioned.  Rather  than  "where,"  the 
dorsal system seems specialized for representing spatial relations, including
those among parts of a single object. The relations among high-resolution
representations  of  parts  presumably  are  represented  the  same  way  as  are  the 
spatial  relations  among  separate  objects  in  a  scene.  (However,  note  that  the 
parts  and  their  relations  are  also  implicit  in  a  low-resolution,  low  spatial 
frequency  representation;  for  example,  a  handle  of  a  mug  will  be  a  bulge  on  the 
blob-like  representation  of  the  mug.)  In  the  relevant  experiments,  animals 
have  never  been  required  to  discriminate  among  patterns  that  differ  only  in  the 
relations  among  parts;  usually  stimuli  differ  in  terms  of  numerous  features, 
and  the  relationships  among  them  are  not  important  (e.g.,  as  is  true  for  the 
square  and  plus  sign  used  by  Ungerleider  and  Mishkin,  1982b,  which  can  be 
discriminated  between  simply  by  looking  in  the  center  of  the  figure  and  seeing 
if  there  is  a  line).  In  short,  it  would  appear  that  once  figure  is  segregated 
from  ground,  regardless  of  whether  "figure"  is  an  object  or  part  thereof,  the 
location  of  that  figure  is  represented  in  the  parietal  lobes. 

3. The Problem of Non-rigid Transformations.

We  can  gain  some  insight  into  the  way  spatial  relations  are  represented 
by  considering  another  problem  that  must  be  contended  with  by  a  visual  system: 
namely, the problem that many objects are subject to a near-infinite number of
transformations,  and  so  may  not  look  the  same  from  instance  to  instance.  For 
example, a human form can be configured in a huge number of different ways:
crouching, arms raised, standing on one toe with the arms held out to the side,
and  so  on.  Similarly,  letters  of  the  alphabet  can  occur  in  numerous  fonts, 
which  are  not  simple  linear  transformations  of  each  other.  We  cannot  store  a 
separate  representation  of  all  the  possible  configurations  of  such  objects, 
with  the  aim  of  being  able  to  match  input  to  a  specific  stored 
representation — there  are  simply  too  many  possible  configurations,  and  one 
often  may  encounter  configurations  not  previously  seen.  Thus,  it  is  useful  to 
have  a  representation  that  will  be  stable  across  a  wide  range  of 
transformations.  Two  kinds  of  attributes  remain  constant  under  such 
transformations: First, the individual parts remain the same; although some may
be  hidden  depending  on  the  configuration,  no  parts  are  actually  added  or 
deleted  from  the  object.  Second,  the  topological  relations  among  parts  remain 
constant  under  all  of  these  transformations.  Topological  relations  are  more 
abstract  than  the  precise  relative  position  of  two  parts  as  they  appear  in  any 
given  case  (i.e.,  the  topographic  relations);  they  indicate  which  parts  are 
connected  to  each  other  and  which  are  contained  within  each  other.  For 
example,  the  topological  relation  between  the  arm  and  shoulder  remains  constant 
under  all  of  the  different  positions  the  arm  can  take.  However,  literally 
topological  relations  are  too  weak;  a  teacup  and  a  phonograph  record  are 
identical  under  a  topological  description.  The  relations  of  ears  to  the  side 
of  the  head,  or  the  thumb  to  a  hand,  are  important  and  will  remain  constant 
under  transformations.  Thus,  some  general  categories  of  relations,  such  as 
"left/right," "side of," "connected to at the end," and so on, must be used,
not  the  actual  topographic  appearances. 

This  problem  places  requirements  on  what  the  dorsal  system  must  do: 
this system must be able to derive a description of relations which will remain
true  under  a  large  number  of  ways  of  configuring  the  object.  These 
descriptions  themselves  cannot  be  images;  images  are  concrete,  being 
representations  of  specific  instances.  Instead,  the  dorsal  system  must  be  able 
to  make  use  of  more  abstract,  "categorical"  representations.  Such 
representations  capture  general  properties  of  a  relationship  without  specifying 
the details (e.g., "left of" without specifying how much or exactly what
angle).
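
The difference between a categorical and a metric description of a relation
can be sketched as follows. This is a minimal illustration, not a proposal
about how the dorsal system computes relations: the coordinates, the part
names, and the functions metric_offset and categorical_relations are all
assumptions introduced for the example, with y taken to increase upward.

```python
# Hypothetical sketch: metric versus categorical descriptions of the
# relation between two parts, each given as an (x, y) location.

def metric_offset(a, b):
    """Exact displacement of part a from part b."""
    return (a[0] - b[0], a[1] - b[1])


def categorical_relations(a, b):
    """Coarse categories for where part a lies relative to part b,
    discarding how far and at exactly what angle."""
    relations = set()
    if a[0] < b[0]:
        relations.add("left of")
    elif a[0] > b[0]:
        relations.add("right of")
    if a[1] > b[1]:
        relations.add("above")
    elif a[1] < b[1]:
        relations.add("below")
    return relations


# Two invented configurations of the same form.
standing = {"head": (0.0, 10.0), "torso": (0.0, 6.0)}
crouching = {"head": (0.0, 7.5), "torso": (0.0, 4.0)}

# The metric description differs from one configuration to the other...
assert metric_offset(standing["head"], standing["torso"]) != \
    metric_offset(crouching["head"], crouching["torso"])
# ...whereas the categorical description is the same in both.
assert categorical_relations(standing["head"], standing["torso"]) == \
    categorical_relations(crouching["head"], crouching["torso"]) == {"above"}
```

The sketch covers only directional categories; relations such as "connected
to at the end" would require information about contact between parts that is
not represented here.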

Finally,  it  would  seem  necessary  that  at  some  point  the  representations 
of  perceptual  units  and  their  relationships  must  come  together.  A  possible 
locus of that nexus is the association cortex near Wernicke's area (in the
posterior,  superior  temporal  lobe),  which  appears  to  be  involved  in  semantic 
processing.  However,  this  sort  of  arrangement  is  somewhat  awkward,  in  that  the 
relations  must  be  delivered  in  synchrony  with  the  related  units;  if  the  inputs 
fall  out  of  phase,  one  may  make  "illusory  conjunctions."  That  is,  one  may 
conjoin  units  using  the  wrong  relations.  Interestingly  enough,  Treisman  and 
Gelade  (1980)  report  just  such  illusory  conjunctions  when  the  system  is  pushed 
to  perform  well  in  a  difficult  task.  They  found  that  subjects  will 
occasionally report seeing a T when a field containing instances of I and Z was
shown,  which  would  follow  if  the  vertical  line  was  mistakenly  conjoined  with  a 
horizontal  segment  from  the  Z. 

III.  PROCESSING  MODULES  USED  IN  IMAGERY 

Before  continuing,  it  will  be  useful  to  summarize  where  we  have  arrived 
so  far.  We  have  discussed  two  classes  of  mechanisms  used  in  vision.  The 
"ventral  system"  accesses  stored  representations  that  are  associated  with  an 
individual part or with the overall shape envelope. The representation of
shape used in the ventral system should be concrete, capturing the precise
shape and surface details of the part or object. This system does not process
relations among parts, except insofar as they are implicit in a low resolution
representation  of  the  entire  object  (a  kind  of  blurry  silhouette).  In 
contrast, we are led to assume that the "dorsal system" must be able to derive
abstract "categorical" representations of the spatial relations among parts or
objects. These sorts of representations group spatial relations into
categories that are characterized by the presence of a specific relation (e.g.,
"left of," "above," "next to"). The use of categorical representations of
spatial  relations  is  especially  appropriate  for  classes  of  objects  whose 
members  are  subject  to  non-rigid  transformations.  In  these  cases,  the  parts 
can  be  arranged  in  a  large  number  of  topographical  configurations.  For 
example,  there  is  no  combination  of  uniform  linear  alterations  in  size, 
orientation  or  position  that  will  change  an  italic  version  of  an  upper  case 
letter  to  a  Times  Roman  or  Geneva  font  and  vice  versa,  or  that  will  map  each 
human  postural  configuration  into  every  other.  In  such  cases,  the  topological 
relations  must  be  abstracted  out  from  given  exemplars.  Finally,  we  have 
assumed that the information about units and relations must be combined at a
later  stage  in  processing,  and  that  a  rather  tight  linkage  must  be  maintained 
between  the  dorsal  and  ventral  systems. 

Image  Generation 

It  seems  safe  to  assume  that  visual  images  are  formed  on  the  basis  of 
representations  that  initially  were  encoded  during  perception.  If  so,  then  we 
are in a position to exploit the analysis presented in the previous sections to
formulate a theory of the processing modules used in imagery. The analysis of
visual  processing  has  direct  implications  for  a  theory  of  mental  image 
generation.  That  is,  images  are  not  always  in  mind;  when  appropriate,  they  are 
formed  on  the  basis  of  stored  information.  The  question  is,  how  are  they 
generated? 

Images  must  be  formed  on  the  basis  of  information  encoded  during 
perception.  This  stored  information  can  later  be  compared  against  new  input, 
and hence used for recognition. Thus, we can infer that images are formed on
the basis of information that can also be used in perceptual recognition; the
stored visual representations used in recognition are "concrete," containing
enough information to allow one to reconstruct the actual appearance. The
process that activates stored visual information can be conceptualized as a
processing module that activates representations stored in memory to produce a
pattern of activation in a "visual buffer" (this pattern of activation is an
image representation). The purpose of this transformation is to make explicit
the spatial properties of a shape, which is required to accomplish the purposes
of imagery discussed earlier. The "visual buffer" is assumed to be a
functionally defined storage medium that probably corresponds to the joint
operation of numerous topographically organized areas of cortex (see Van Essen
and Maunsell, 1983). We assume that these visual parts of cortex also can be
activated from stored information, resulting in a mental image. This buffer is
equivalent to the buffer that supports Marr's (1982) "2 1/2 D sketch" in
vision. 

When  we  see  patterns,  we  actively  organize  and  parse  them  into  separate 
perceptual units, and these units are stored (e.g., see Reed, 1974; Reed and
Johnsen, 1975). Thus, the processing module that activates stored visual
information would activate representations of previously encoded perceptual
units. Activating a stored unit could result in an image of a single part or a
low-resolution image of the entire object (provided that such a unit was
encoded). For convenience, let us call this module the PICTURE processing
module.

If  the  relations  among  units  are  stored  using  categorical 
representations,  then  other  modules  must  be  used  if  a  multipart  or  detailed 
object  is  to  be  imaged.  We  need  to  posit  a  processing  module  that  can  access 
the  descriptions  of  relations  and  use  them  to  juxtapose  separate  parts  in  the 
correct  relative  positions  in  an  image.  Such  a  module  would  look  up  and 
interpret  a  description  of  how  parts  are  to  be  arranged.  For  example,  in 
generating  a  detailed  image  of  a  car,  it  might  look  up  "front  wheel"  and 
discover  the  location  description  "under  front  wheelwell."  (Such  a  categorical 
representation  would  be  used  because  of  the  great  variability  in  the 
appearances  of  different  types  of  cars.)  For  convenience,  let  us  call  this  the 
PUT  processing  module. 

There  are  two  ways  in  which  a  description  could  specify  the  spatial 
relations  among  parts.  On  the  one  hand,  positions  could  be  specified  in  terms 
of  absolute  location.  On  the  other  hand,  positions  could  be  specified  relative 
to  other  parts.  If  objects  are  subject  to  non-rigid  transformations,  the 
absolute  location  of  parts  will  change  depending  on  the  configuration.  Thus, 
for  the  same  reasons  we  hypothesize  that  categorical  representations  are  used 
for  spatial  relations  during  recognition,  we  also  posit  that  these 
representations specify relative locations: such representations will remain
constant under various transformations. For example, an arm is connected to a
shoulder regardless of how the person is postured.

Thus, if relative positions are used, then one must know the location
of a reference point in order to add a part to a multipart image (e.g., the
"wheelwell" for a car's wheel, or the shoulder for an arm); only after
locating the reference point will one be able to position another part
correctly in an image. In order to locate a reference point, a third module
must be used. This module needs to search for a specific part, which is one
function that requires the ventral visual system. For convenience, we will
call this module the FIND processing module. The PUT processing module uses
the output from the FIND processing module (e.g., the location of the wheelwell
on the car's body, where the front wheel belongs) plus the description of the
relation ("under") to compute parameter values for the PICTURE processing
module,  allowing  it  to  form  an  image  of  the  new  part  in  the  correct  relation  to 
the  foundation  part. 
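
The interplay of the three modules can be made concrete with a minimal sketch. The Python fragment below is an illustration only, not the simulation model discussed in this paper: the visual buffer is reduced to a dictionary of grid locations, each categorical relation is mapped to a fixed grid offset, and every function and variable name is a hypothetical choice made for the example.

```python
from dataclasses import dataclass

# Assumed mapping from a categorical relation to a (row, column) grid offset.
# Rows increase downward, so "under" adds one row.
RELATION_OFFSETS = {
    "under": (1, 0),
    "above": (-1, 0),
    "left of": (0, -1),
    "right of": (0, 1),
}


@dataclass
class PartDescription:
    name: str        # part to be added, e.g. "front wheel"
    relation: str    # categorical relation, e.g. "under"
    reference: str   # part that serves as the reference point


def picture(buffer, name, location):
    """PICTURE stand-in: activate a stored unit at a location in the buffer."""
    buffer[name] = location


def find(buffer, name):
    """FIND stand-in: search the buffer for a part and return its location."""
    if name not in buffer:
        raise LookupError(f"reference part {name!r} is not in the image")
    return buffer[name]


def put(buffer, description):
    """PUT stand-in: interpret a stored description, use the output of FIND,
    and compute the location parameter handed to PICTURE."""
    ref_row, ref_col = find(buffer, description.reference)
    d_row, d_col = RELATION_OFFSETS[description.relation]
    picture(buffer, description.name, (ref_row + d_row, ref_col + d_col))


def generate_image(foundation, foundation_location, descriptions):
    """Image the foundation part first, then add each described part in turn."""
    buffer = {}
    picture(buffer, foundation, foundation_location)
    for description in descriptions:
        put(buffer, description)
    return buffer


wheel = PartDescription("front wheel", "under", "front wheelwell")
image = generate_image("front wheelwell", (2, 5), [wheel])
print(image["front wheel"])  # (3, 5)
```

Because the wheel is stored only as "under" the wheelwell, imaging the wheelwell at a different location moves the wheel with it, which is the invariance that motivates the use of relative locations.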

Image  Transformations 

Our  analysis  of  visual  image  transformations  begins  with  the 
observation  that  for  some  tasks  we  need  more  than  a  categorical  relation  among 
parts.  For  example,  in  navigating  in  the  dark  you  need  to  know  exactly  where 
various  pieces  of  furniture  are  located  in  the  room,  not  simply  their  relative 
positions.  Similarly,  in  recognizing  faces  you  need  to  know  the  metric  spatial 
relations  among  features,  not  simply  their  general  positions.  Thus,  we 
apparently  need  two  ways  of  representing  positions:  categorical  relations  and 
coordinate  positions  within  a  specific  frame  of  reference  (e.g.,  a  room  or 
face). Image transformations, such as moving the position of an object in an
imaged scene, or altering its orientation, involve changes in metric spatial
relations.  Thus,  they  presumably  require  altering  the  coordinate 
representations  of  locations. 
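
The difference between the two ways of representing position can be illustrated with a small sketch. It assumes, purely for the example, that locations are (x, y) pairs with y increasing upward and that a categorical relation is read off the dominant axis of separation; neither assumption is part of the theory, and the function names are hypothetical.

```python
def coordinate_relation(a, b):
    """Metric relation: the exact offset of a from b in the reference frame."""
    return (a[0] - b[0], a[1] - b[1])


def categorical_relation(a, b):
    """Categorical relation: a coarse label for where a lies relative to b.
    Identical points fall into "right of" by convention in this sketch."""
    dx, dy = coordinate_relation(a, b)
    if abs(dx) >= abs(dy):
        return "left of" if dx < 0 else "right of"
    return "below" if dy < 0 else "above"


lamp, chair = (1.0, 0.0), (3.0, 0.5)
print(categorical_relation(lamp, chair))  # left of
print(coordinate_relation(lamp, chair))   # (-2.0, -0.5)

# Moving the chair changes the coordinate relation but not the category.
chair = (6.0, 0.5)
print(categorical_relation(lamp, chair))  # left of
print(coordinate_relation(lamp, chair))   # (-5.0, -0.5)
```

The category survives the move, which is what makes it useful for recognition, whereas only the coordinate offset says how far one must walk to avoid the chair.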

The  most  basic  finding  about  image  transformations  is  that  the 
transformation  process  typically  is  incremental;  the  process  alters  the 
representation  of  position  so  that  the  transformed  object  moves  through  a 
trajectory, occupying intermediate positions as it is being transformed (see
Shepard and Cooper, 1982). The evidence for this assertion rests on
chronometric data, such as the finding that more time is required to rotate or
expand an imaged object by progressively greater amounts (see Figure 2). We can
account for this property of image transformations with the following
assumptions: 1) High-resolution images of objects are composed by activating
stored  encodings  of  distinct  parts.  This  assumption  follows  from  our  analysis 
of  image  generation.  2)  The  representations  of  the  locations  of  parts  are 
manipulated  individually  when  the  image  is  transformed.  This  idea  follows 
because  a  coordinate  representation  must  be  manipulated  when  one  needs  to  alter 
a viewer-centered representation, such as by changing orientation or size;
categorical  representations  do  not  embody  the  metric  spatial  relations  among 
objects  or  parts  (indeed,  such  representations  are  used  to  abstract  out  what  is 
constant  over  such  variations).  In  the  coordinate  representation,  the  location 
of  each  separate  part  is  specified  as  a  separate  representation.  3)  The 
behavior  of  the  brain  is  subject  to  random  perturbation.  This  observation  is 
true of all physical systems; "noise" is pervasive. 4) Therefore, the
locations  of  parts  of  an  imaged  object  are  not  altered  equally  with  each 
increment of transformation; there is noise in the movement operation, and the
parts become misaligned. 5) In order to realign the locations, there must be a
representation of spatial relations that does not change with different
coordinate positions of the parts. I have argued earlier that just this type
of "categorical" relation is encoded during perception, and is used to generate
images  of  non-rigid  objects. 

Presumably,  the  amount  of  misalignment  is  proportional  to  the  size  of 
the  shift  (i.e.,  variability  is  usually  proportional  to  the  mean),  with  larger 
shifts  resulting  in  greater  scrambling.  This  notion  would  explain  why  images 
are  transformed  in  a  series  of  small  increments:  if  the  positions  are  too 
scrambled,  it  will  be  difficult  simply  to  identify  the  corresponding  parts  and 
to  use  stored  descriptions  to  realign  them. 
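
The argument for small increments can be sketched in code. The fragment below is a toy illustration under strong assumptions, not a model proposed in this paper: parts are points, a repositioning step (here called move) rotates each part separately and adds bounded random error that grows with the size of the step, and a realignment step (here called cleanup) stands in for the use of stored descriptions by identifying each part with the nearest position expected from a stored template. All names and parameter values are hypothetical.

```python
import math
import random


def rotate(point, degrees):
    """Rotate a point counterclockwise about the origin."""
    x, y = point
    a = math.radians(degrees)
    return (x * math.cos(a) - y * math.sin(a), x * math.sin(a) + y * math.cos(a))


def move(parts, step_degrees, noise_per_degree, rng):
    """Reposition every part separately; the error bound scales with the step."""
    spread = noise_per_degree * abs(step_degrees)
    moved = {}
    for name, point in parts.items():
        x, y = rotate(point, step_degrees)
        moved[name] = (x + rng.uniform(-spread, spread),
                       y + rng.uniform(-spread, spread))
    return moved


def cleanup(parts, template, total_degrees):
    """Identify each part by its nearest expected position and realign it.
    Returns None if any part is too scrambled to be identified correctly."""
    expected = {name: rotate(p, total_degrees) for name, p in template.items()}
    for name, point in parts.items():
        nearest = min(expected, key=lambda n: math.dist(point, expected[n]))
        if nearest != name:
            return None
    return expected


def mental_rotation(template, total_degrees, step_degrees, noise_per_degree, seed=0):
    """Rotate in increments, realigning after each one.
    Returns (parts, steps); parts is None if realignment failed."""
    if step_degrees <= 0:
        raise ValueError("step_degrees must be positive")
    rng = random.Random(seed)
    parts, turned, steps = dict(template), 0.0, 0
    while turned < total_degrees:
        step = min(step_degrees, total_degrees - turned)
        parts = move(parts, step, noise_per_degree, rng)
        turned += step
        steps += 1
        parts = cleanup(parts, template, turned)
        if parts is None:
            return None, steps
    return parts, steps


template = {"top": (0.0, 1.0), "left": (-1.0, 0.0), "right": (1.0, 0.0)}
parts, steps = mental_rotation(template, 90.0, 10.0, 0.01)
print(steps)  # 9
```

With 10-degree steps the error on each part is bounded well below half the distance between parts, so every part is identified and realigned. Attempting the whole rotation in one step raises the error bound in proportion, and the same identification procedure is then no longer guaranteed to succeed; the number of steps, and hence the time, grows with the size of the rotation.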

This  analysis  leads  us  to  posit  two  additional  processing  modules:  One 
module  is  required  to  alter  the  representation  of  the  positions  of  parts  of  an 
imaged  object.  We  presume  that  this  operation  involves  updating  coordinate 
representations  in  the  dorsal  system.  We  can  call  this  the  MOVE  processing 
module.  A  second  module  is  required  to  look  up  and  use  descriptions  of  the 
spatial relations to direct the MOVE module to realign any misaligned parts.
We can call this the CLEANUP processing module. Finally, the FIND module must
also  be  used  in  image  transformations.  The  CLEANUP  module  must  make  use  of  the 
FIND  module  to  discover  the  current  locations  of  the  parts,  which  is  necessary 
before  it  can  compute  how  to  realign  them;  this  use  of  the  FIND  module  is 
analogous  to  its  use  by  the  PUT  module  during  image  generation.  (In  addition, 
in  some  tasks  the  FIND  module  presumably  is  used  to  locate  the  top  of  the 
object to provide information about the shortest way to rotate; it presumably
also is used to make the requisite judgment when the object has been
transformed far enough.)

One  way  we  sought  evidence  for  our  analyses  of  high-level  visual 
processing  was  by  looking  for  "functional  dissociations"  among  the  modules  we 
have  posited.  That  is,  we  wanted  to  show  that  the  two  cerebral  hemispheres  had 
different  abilities,  which  corresponded  to  having  different  sets  of  processing 
modules.  In  order  to  see  how  the  predictions  were  made,  however,  we  must 
understand  the  bases  for  our  expecting  different  modules  to  be  localized 
differently  in  the  brain.  Thus,  we  now  must  turn  to  that  aspect  of  the  theory. 

IV.  MECHANISMS  OF  HEMISPHERIC  DIFFERENTIATION 

The  theory  as  stated  so  far,  then,  posits  a  set  of  processing  modules 
and  some  types  of  representations  on  which  they  operate.  The  approach  we  have 
taken  suggests  that  the  functional  organization  of  the  system  is  a  consequence 
in  part  of  the  kinds  of  information-processing  problems  that  must  be  solved 
(cf. Marr, 1982). The best solutions to the problems presumably influence
functional  organization  in  two  ways:  First,  the  brain  presumably  evolved  to 
solve  those  problems  efficiently,  and  hence  the  evolution  of  brain  structure 
may  have  been  shaped  by  the  computational  problems.  Second,  the  experience  of 
the  individual  organism  in  dealing  with  the  problems  may  engender  the 
development  of  a  specific  functional  organization.  The  present  theory  is 
concerned  with  the  mechanisms  that  result  in  an  individual's  experiences 
shaping  the  functional  organization  of  the  two  hemispheres  of  the  brain.  This 
theory  rests  on  a  set  of  relatively  simple,  uncontroversial  properties  of  the 
brain;  what  is  original  here  is  putting  them  together  and  observing  the 
consequences  as  they  interact.  The  relevant  properties  are  as  follows: 

1. Processing components. The brain is functionally organized into a
collection of separate processing modules. This claim is supported by the
subtle and distinct patterns of fractionation that are evident in the
behavioral dysfunction following brain damage (e.g., see Heilman and
Valenstein, 1979). Presumably, the behavioral deficits reflect damage
to some of the modules proper and/or to their interconnections (cf.
Geschwind, 1965). This arrangement makes sense if the system evolved
piecemeal,  with  new  components  being  added,  or  old  ones  being  modified,  to  work 
with  those  already  available.  A  modularized  system  is  easiest  to  alter,  as 
computer programmers have long since discovered.

2. Exercise. A "processing module" is a functional description of what
is  done  by  a  neural  network.  If  such  a  network  is  subjected  to  the  same 
pattern  of  input  repeatedly,  the  subsequent  internal  pattern  of  activation  and 
subsequent  output  will  come  to  be  achieved  more  quickly  and/or  more  reliably. 
Such  "practice  effects"  presumably  reflect  actual  physical  changes  at  the 
cellular  level.  We  need  to  posit  something  like  this  principle  to  explain  the 
very  basic  finding  that  practice  improves  the  performance  of  even  the  simplest 
tasks.  This  idea  goes  back  at  least  as  far  as  Hebb  (1949).  In  short,  a 
processing  module  becomes  increasingly  efficient  at  carrying  out  a  specific 
computation  as  it  is  used  to  do  so  increasing  numbers  of  times. 

3.  Selectivity.  A  processing  module  can  only  operate  on  one  input  at  a 
time.  This  claim  is  almost  definitional,  given  that  a  given  neuron  can  only  be 
in  one  state  at  any  given  moment  in  time.  Hence,  a  set  of  neurons  can  only  be 
in  one  state  at  a  time,  including  those  neurons  that  serve  as  input  to  a  neural 
net.  We  can  regard  the  input  to  a  module  as  a  vector  (values  on  the  input 
lines to various neurons). If so, then a given module will receive only one
vector as input at any given moment; it is physically impossible to have two
vectors  being  processed  simultaneously,  given  that  this  would  require  that  at 
least  one  input  neuron  be  in  two  states  simultaneously.  And  if  two  vectors 
were  intermixed,  it  would  be  impossible  to  sort  out  which  values  go  with  which 
input  vector  and  the  input  would  be  uninterpretable. 

4.  "Central"  bilateral  control.  Some  activities  involve  executing 
rapid  sequences  of  precise,  ordered  operations  that  extend  over  both  halves  of 
the  body.  In  such  cases,  one  does  not  want  to  have  the  operations  carried  out 
by  both  hemispheres;  given  the  physical  separation  of  the  hemispheres,  it  would 
be  difficult  to  keep  the  processes  synchronized.  This  idea  implies  that  a 
relatively rapid ordered sequence of precise operations that extend over both
halves of the body will be controlled in a single locus. For example, when one
is speaking rapidly, one does not want to have to control the left and right
sides of the speaking apparatus separately, synchronizing two sets of commands.
Thus, the area that controls speech output is on only one side of the brain
(typically  the  left)  and  is  situated  near  the  motor  strip  (precentral  gyrus). 
Similarly,  in  programming  rapid  shifts  of  attention  over  an  object  or  scene, 
one  does  not  want  to  have  to  coordinate  corresponding  operations  in  the  two 
sides  of  the  brain.  Thus,  in  most  right-handed  males  the  right  parietal  region 
appears  to  have  a  special  role  in  directing  attention  (for  a  review  see  chapter 
4  of  De  Renzi,  1982). 

I  had  originally  thought  that  these  four  properties  of  the  brain 
operating  together  would  be  sufficient  to  produce  functional  differentiation 
between  the  two  hemispheres.  The  theory  was  as  follows:  First,  I  assumed  that 
differentiation develops over age and experience, as a child has occasion to
develop and use language and to learn his or her way about in the environment.
The  mechanism  underlying  this  sort  of  differentiation  depends  critically  on  the 
property of "bilateral control." This property leads us to expect
innately determined asymmetries in modules that utilize rapid, precisely
ordered bilateral sequences of operations. (I do not have a theory as to why
some of these modules are predominantly stronger on the left side whereas other
modules are predominantly stronger on the right side.) These innately
lateralized modules putatively come to serve as the initial "seeds" (in a
catalytic sense) in the differentiation process. For example, consider first
differentiation in "categorical" representation and use: An initial "seed"
module, which is innately stronger on the left side, is a "speech output area."
This area coordinates the mouth, tongue, and vocal apparatus to produce
phonemes. Given this beginning asymmetry, I thought that the other three
principles would produce a snowball effect: Initially, there are modules in
both hemispheres that produce output used in making speech sounds (e.g., that
set up programs to order sounds). These modules apparently are located in what
is called Broca's area on the left side. The modules on the left side are
relatively  close  to  the  speech  output  area,  and  hence  will  be  selected  over  the 
corresponding  modules  in  the  right  hemisphere.  That  is,  the  additional  time 
necessary  for  the  trip  over  the  corpus  callosum  from  the  right  hemisphere  will 
result  in  the  left-hemisphere  processing  modules  being  selected  more  often  than 
the  corresponding  ones  in  the  right  hemisphere.  This  sequence  of  events  will, 
via  the  property  of  exercise,  result  in  speech-related,  and  then 
language-related,  modules  becoming  stronger  in  the  left  hemisphere.  Once  these 
modules are stronger on the left side, the effect then compounds: now these
modules play the same role as did the speech output area in the differentiation
process. Modules that make use of these previously differentiated modules
should themselves become strengthened in the left hemisphere. The result of
this snowball effect should be that modules that make use of categorical
representations, which can be manipulated by rule systems (such as those used
in language and arithmetic), will in general become stronger in the left
hemisphere. This notion is supported by evidence from studies of stroke
patients and split-brain patients that the left hemisphere has a special role
in arithmetic and inference (see Heilman and Valenstein, 1979).

The  theory  as  stated  so  far  turned  out  to  be  inadequate  when  we  got  to 
the stage of actually building a computer simulation model. Two deficiencies
became apparent almost immediately: First, given that the modules that send
information  to  the  "seed"  module  (which  is  lateralized  on  one  side)  have 
produced  an  output,  they  have  been  exercised;  they  do  not  "know"  whether  or  not 
their  output  was  used.  Thus,  both  the  corresponding  left  and  right  modules  are 
strengthened;  the  system  will  not  lateralize  as  planned.  Second,  what  is 
stopping  the  output  from  one  module  from  interrupting  the  output  from  another? 
That  is,  although  the  property  of  selectivity  ensures  that  the  output  from  only 
one  module  will  serve  as  input  to  another  at  any  given  time,  there  was  no 
reason  why  a  late-coming  input  could  not  supplant  a  previous  one.  The  problem 
here is that if enough competing inputs are present, the target module may
never  receive  consistent  input  long  enough  to  be  able  to  use  it.  A  neural 
network  will  not  produce  a  consistent  output  unless  the  input  is  maintained 
over  a  period  of  time.  That  is,  it  takes  time  for  a  network  to  settle  into  a 
stable pattern of activity, and the same input must be maintained during the
course of this settling down process (e.g., see Ackley, Hinton and Sejnowski,
1985).  (Technically,  the  input  lines  must  be  "clamped"  long  enough  for  the 
network  to  settle  into  equilibrium.)  If  the  network  is  not  at  equilibrium,  it 
will  not  systematically  produce  a  single  output  when  given  an  input. 

Fortunately,  a  fifth  property  of  the  brain,  previously  not  considered 
relevant,  seems  to  solve  both  problems  for  us: 

5.  Reciprocal  innervation.  A  fundamental  fact  about  the  visual  system 
is  that  most  of  the  pathways  have  both  afferent  and  efferent  tracks  (Van  Essen 
and  Maunsell,  1983).  This  property  of  the  system  could  be  used  as  a  way  of 
maintaining  an  input  long  enough  for  a  network  to  make  use  of  it.  In  order  to 
maintain  an  input,  the  efferent  pathways  could  be  used  as  a  feedback  loop, 
stimulating  the  source  of  the  input.  That  is,  the  brain  is  not  a  digital 
computer;  it  does  not  pass  discrete  symbols  back  and  forth.  Rather,  we  assume 
that  modules  produce  patterns  of  activity,  which  can  be  sustained  if  the  module 
is  driven  to  do  so.  While  the  output  from  one  sending  module  is  being 
sustained,  the  target  module  cannot  receive  the  output  from  another  module;  the 
sending  and  target  modules  are  "locked"  into  a  loop.  Thus,  the  property  of 
selectivity  can  be  regarded  as  a  consequence  of  the  property  of  reciprocal 
innervation, and need not be treated as a separate property of the mechanism.

A  snowball  theory  of  differentiation 

Thus,  the  mechanism  of  differentiation  just  described  becomes  modified: 
once an input arrives at a module and is entered into it, the output from the
sending module will be sustained. If so, then only the module that produces
output that arrives at the target module first will be stimulated, and hence
exercised. Furthermore, the feedback loop can serve to "lock in" an input
until the network has settled, which solves the problem of multiple
interruptions. Thus, the theory rests on the ideas that a) "seed" modules are
initially  lateralized;  b)  when  the  output  from  a  module  serves  as  input  to  a 
seed,  feedback  from  the  seed  module  drives  that  sending  module  until  the  seed 
has  settled  (i.e.,  interpreted  the  input);  c)  during  this  period,  the  sending 
module  is  exercised,  resulting  in  its  becoming  faster  and  more  noise  resistant 
when  used  in  the  same  way  in  the  future.  Only  the  module  whose  output  is 
actually  used  becomes  exercised;  the  output  from  the  corresponding  one  on  the 
other  side  is  not  used  (is  selected  against),  and  hence  it  is  not  driven  to 
remain  in  the  output  state  long  enough  to  become  exercised.  This  process  is 
repeated with "second-order seeds," modules that themselves are not innately
lateralized  but  that  become  so  during  the  course  of  experience. 
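
The logic of this mechanism can be illustrated with a toy simulation. The sketch below is not the simulation model referred to above, whose details are not given here; it is a minimal illustration under assumed values. A seed module is taken to lie in the left hemisphere, corresponding sending modules in the two hemispheres race to deliver output to it, output from the right incurs an assumed callosal delay, and only the module whose output arrives first is locked in by feedback and exercised (made slightly faster). All names, parameters, and numbers are hypothetical.

```python
import random


def simulate_snowball(trials, base_time=10.0, callosal_delay=2.0,
                      jitter=3.0, gain=0.02, floor=1.0, seed=0):
    """Race left and right sending modules to a left-hemisphere seed module.

    On each trial the first output to arrive is selected, and only the
    selected module is exercised: its processing time shrinks by `gain`,
    down to a minimum of `floor`.
    """
    rng = random.Random(seed)
    time = {"left": base_time, "right": base_time}
    wins = {"left": 0, "right": 0}
    for _ in range(trials):
        arrival = {
            "left": time["left"] + rng.uniform(0, jitter),
            # Output from the right hemisphere must cross the corpus callosum.
            "right": time["right"] + callosal_delay + rng.uniform(0, jitter),
        }
        winner = min(arrival, key=arrival.get)  # first to arrive is locked in
        wins[winner] += 1
        time[winner] = max(floor, time[winner] * (1 - gain))  # exercise
    return time, wins


time, wins = simulate_snowball(200, jitter=0.0)
print(wins)  # {'left': 200, 'right': 0}
```

With no trial-to-trial variability the left module wins every race and the right module is never exercised. With variability (the default jitter), the right module can win some early trials, but each left win shortens the left module's processing time and so makes later left wins more likely, which is the compounding that the snowball metaphor is meant to capture.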

This mechanism, then, will result in the left hemisphere's becoming
specialized for using categorical representations. These representations are
well suited for specifying pairwise relations among parts (e.g., a hand is
connected to a wrist, a wrist to a forearm, etc.). They are not well suited,
however, for representing graded, topographic information nor for representing
locations in absolute space. In most people, the right hemisphere appears to
have become relatively specialized for these functions (see De Renzi, 1982).
The theory for this development is analogous to the theory of development of
categorical representation, only in this case the property of central bilateral
control underlies a unilateral locus for one component of our attentional
mechanism, namely that involved in directing shifts in attention. (We must be
careful here, however; at present there are at least four loci implicated in
attentional shifts: the right parietal, frontal eye fields, superior
colliculus, and reticular activating system; see Mesulam, 1981.)

The  present  claim,  then,  is  that  the  parietal  lobe  represents  spatial 
relations,  and  can  do  so  in  two  ways:  in  terms  of  a  given  category  of  spatial 
relation  or  in  terms  of  coordinate  points  in  space.  Because  metric  information 
is  necessary  for  planning  attentional  shifts,  the  modules  that  produce  the 
metric  map  become  exercised  and  stronger  in  the  right  hemisphere.  Thus,  the 
proposed  mechanism  will  result  in  the  right  parietal  lobe  coming  to  represent 
spatial  relations  by  using  coordinate  points  in  a  metric,  analog  "map"  of  where 
objects  fall  in  the  space  around  an  organism;  in  contrast,  the  left  parietal 
lobe  will  come  to  represent  relations  categorically.  This  conjecture  allows  us 
to  explain  much  data  on  the  right  hemisphere's  superior  topographic  ability 
(see  Byrne,  1982)  and  ability  to  represent  metric  location  (e.g.,  see  Kimura, 
1969). The topographic information in the right hemisphere is especially
critical for purposes of navigation: Simply knowing that one object is not
connected to another does not tell you if there is enough room between them to
put  your  foot.  Thus,  modules  concerned  with  representing  space  for  use  in 
navigation  also  will  come  to  be  stronger  in  the  right  hemisphere. 

The  theory  as  developed  so  far  posits  that  the  ventral  visual  system  is 
not lateralized; it is simply duplicated on both sides of the brain. This
duplication occurs because the output is used directly in both hemispheres, and
so both sets of modules are exercised. In the left hemisphere, however, the
output is converted to categorical representations (perhaps using "symbols,"
with conversion occurring via the polysensory association areas and then
Wernicke's area); in the left hemisphere the relations among parts, putatively
processed in the dorsal visual system, are represented in a categorical format.
In the right hemisphere, in contrast, the output from the ventral system is not
converted into a categorical form. Indeed, the right hemisphere usually may
have no analog to Wernicke's area, which converts input to a semantic form. (A rare
individual may violate this generalization, however; see De Renzi, 1982.)
Furthermore, the relations represented in the right hemisphere dorsal system
will not be categorical. Rather, objects' locations are specified in a
topographic map of space, and the relations are relative to a single reference
point (the origin of the coordinate space); inter-part relations are only
indirectly  specified.  The  possibility  of  representing  spatial  relations  in  two 
ways  suggests  that  the  hemispheres  will  have  different  roles  in  performing 
spatial  tasks. 

Lateralization  of  Imagery  Processing  Modules 
The  theory  of  hemispheric  differentiation  leads  us  to  expect  that  the 
PUT  processing  module  will  be  more  efficient  in  the  left  hemisphere.  This 
prediction  follows  because  this  module  must  access  and  interpret  categorical 
representations. Presumably, the interpretation of these representations makes
use  of  other  modules  that  are  also  recruited  in  language  processing.  If  so, 
then  the  snowball  theory  described  above  will  result  in  the  PUT  module  becoming 
more  effective  on  the  same  side  as  these  modules.  In  contrast,  the  PICTURE 
module  should  be  equally  effective  in  both  hemispheres.  The  visual 
representations  on  which  it  operates  are  duplicated  in  both  temporal  lobes  (see 
Gazzaniga,  1970).  Thus,  there  is  no  reason  to  think  that  the  activation  of 
these  stored  representations  should  be  favored  on  one  side  over  the  other. 
Similarly,  the  FIND  processing  module  should  be  equally  effective  in  both 
hemispheres. The visual buffer is duplicated in both hemispheres (each
representing the contralateral hemifield), and the FIND module should be used
equally  often  in  processing  each  side. 

The  theory  also  allows  us  to  make  predictions  about  the  ease  of  mental 
rotation  in  the  two  hemispheres.  We  expect  the  two  hemispheres  to  be  involved 
in  different  ways  in  mental  rotation.  As  was  discussed  earlier,  the  rotation 
of  complex  figures  should  require  use  of  descriptions  to  realign  the  parts  as 
they  become  scrambled  during  the  rotation  process.  The  CLEANUP  (realign) 
module  will  become  more  effective  in  the  left  hemisphere  for  the  same  reasons 
that  the  PUT  module  should  become  more  effective  in  the  left  hemisphere.  In 
contrast,  the  repositioning  operation  performed  by  the  MOVE  module  depends  on 
altering  the  topographic  representation  of  the  layout  of  individual  parts.  The 
modules  that  produce  this  representation  are  more  effective  in  the  right 
hemisphere.  The  theory  thus  predicts  that  the  MOVE  module  will  become  more 
effective  in  the  right  hemisphere. 

IV.  NEUROPSYCHOLOGICAL  EVIDENCE  FOR  LATERALIZATION 
OF  THE  IMAGE  GENERATION  MODULES 

If  our  theory  of  the  functional  organization  of  high-level  vision  is 
correct,  we  should  be  able  to  find  a  sort  of  brain  damage  that  leaves  some  of 
the  modules  intact  while  disrupting  the  others.  Furthermore,  if  the  theory  of 
lateralization  is  correct,  we  should  be  able  to  discover  selective  deficits  in 
the two hemispheres. Thus, we began by investigating a counterintuitive
prediction,  namely  that  the  left  hemisphere  should  be  better  than  the  right  at 
selected  imagery  tasks.  This  prediction  rests  on  there  being  a  dissociation 
between  the  three  modules  purportedly  used  to  generate  visual  mental  images 
(for  details  see  Kosslyn,  Holtzman,  Farah,  and  Gazzaniga,  in  press).  We  sought 
a dissociation between the FIND and PICTURE modules on the one hand, and the
PUT  processing  module  on  the  other.  The  idea  was  that  the  PUT  processing 
module  involves  manipulation  and  use  of  categorical  representations  of  the  type 
used  in  linguistic  processing,  and  hence  our  "snowball  theory"  led  us  to  expect 
that  such  processing  would  typically  be  localized  in  the  left  hemisphere  (the 
locus of linguistic abilities, at least for most right-handed males). In
contrast,  the  theory  leads  us  to  expect  the  FIND  and  PICTURE  modules  to  be 
equally  effective  in  both  hemispheres  (as  was  discussed  in  the  previous 
section).  Thus,  we  hoped  to  show  that  the  left  hemisphere  could  perform  image 
generation  tasks  that  involved  all  three  modules,  whereas  the  right  hemisphere 
could  perform  tasks  that  required  only  the  PICTURE  and  FIND  modules.  This 
prediction  was  especially  interesting  because  the  common  wisdom  has  it  that  the 
right  cerebral  hemisphere  is  the  seat  of  mental  imagery  (e.g.,  see  Springer  & 
Deutsch, 1981; Ley, 1979; Ehrlichman & Barrett, 1983). Thus, if we can show
that  the  left  hemisphere  is  actually  able  to  perform  a  wider  range  of  imagery 
tasks,  this  will  be  particularly  dramatic  evidence  of  the  usefulness  of  the 
computational  approach. 

Imagery  validation.  We  first  showed  that  imagery  was  required  to 
perform a task that putatively recruited all three image generation modules.
The task was to decide from memory whether upper case letters of the alphabet
were  composed  of  all  straight  lines  (e.g.,  K,  L)  or  had  some  curved  lines 
(e.g.,  B,  R).  Our  demonstration  used  the  selective  interference  logic 
developed  by  Brooks  (1967),  Segal  (1971),  and  others.  These  researchers  showed 
that  imaging  and  perceiving  in  the  same  modality  interfere  with  each  other  more 
than  do  imaging  in  one  modality  (e.g.,  visualizing  a  flower)  and  perceiving  in 
another (e.g., listening for a tone). We used a technique developed by Brooks:
He asked subjects to visualize block letters and then to classify the corners
(working clockwise around the letter) according to whether they were on the
extreme top or bottom of the letter. For each corner, subjects were to respond
by either saying "yes" or "no" aloud (as appropriate) or by pointing to "y" or
"n" on a page, working down crooked columns of the letters over the course of
the  task.  Having  to  look  for  and  point  to  the  letters  was  much  more  difficult 
in  this  task  than  merely  saying  "yes"  or  "no".  In  contrast,  in  another  task 
subjects  formed  auditory  images  of  spoken  sentences  and  decided  if  each  word 
was  a  noun  or  not.  Now  saying  the  responses  was  harder  than  pointing  to  them. 
The idea was that visual perception interfered more with visualizing, but
talking  (and  hearing)  interfered  more  with  auditory  imaging. 

We  made  use  of  Brooks'  task  to  garner  evidence  that  the  straight/curved 
letter  judgment  requires  imagery.  College  students  read  down  a  column  of  lower 
case  letters,  and  made  the  straight/curved  judgment  about  the  corresponding 
upper  case  versions.  These  subjects  were  asked  to  respond  either  by  putting  a 
check  mark  in  the  appropriate  location  on  the  page  (which  required  looking  for 
the  place  to  respond)  or  by  saying  the  response  aloud.  Looking  and  making 
check  marks  required  more  time,  even  though  making  check  marks  in  isolation 
actually  took  less  time  than  saying  the  response.  These  results  in  conjunction 
with  the  earlier  findings  implicated  imagery  in  this  task  (see  Kosslyn, 
Holtzman,  Farah  and  Gazzaniga,  1984). 

The next task was to demonstrate that images of upper case letters are
generated  a  segment  at  a  time.  This  was  important  because  the  theory  says  that 
the  PUT  processing  module  is  only  used  when  separate  parts  must  be  amalgamated 
into a composite image. We reasoned that people have seen so many upper case
letters that they do not image a particular one actually seen when asked to
image a letter. When the reader imagines an upper case "A", it probably is not
one actually seen at some point in the past; rather, a "prototypical" A is
probably imaged. We assume that the characteristic features of A's have been
abstracted out and stored as a description, something like "two slanted lines
of roughly equal length meeting at the top, joined roughly halfway down by a
horizontal line". (Although English is used to write the description here,
some  other  code  may  be  used  in  the  brain;  perhaps  the  descriptions  are  stored 
more  like  instructions  in  a  computer.)  To  test  the  claim  that  images  of 
letters  are  created  a  segment  at  a  time,  Kosslyn,  Backer  and  Provost 
(submitted)  showed  subjects  two  x  marks  in  an  otherwise  empty  4x5  grid  and 
asked them whether both x's would fall on a given upper case letter if it were
present  in  the  grid  (as  the  letter  appeared  when  it  was  actually  presented 
previously).  If  the  segments  are  imaged  individually,  then  some  will  be 
present  before  others.  If  so,  then  the  time  to  affirm  that  the  x  marks  would 
fall  on  the  letter  will  depend  on  the  location  of  the  segments  on  which  they 
fell. And indeed, the location of the x marks proved to be critical in the
imagery condition, with more time being required for marks that fell on
segments located towards the end of the letter (i.e., towards the end of the
sequence of drawing it). A number of controls were used to ensure that image
generation,  and  not  image  inspection  after  the  letter  was  formed,  was 
responsible  for  the  effects. 
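
The serial-generation logic behind this prediction can be made concrete with a toy model. The sketch below is only an illustration under stated assumptions: the grid orientation (5 rows by 4 columns), the two-stroke decomposition of the letter L, and the function name segments_needed are hypothetical and are not taken from the experiment. It simply counts how many segments must have been imaged before both probe marks are covered; if segments are imaged one at a time, response time should increase with this count.

```python
# Toy model of segment-by-segment image generation (illustrative assumptions only).
# A letter is an ordered list of segments; each segment is a set of (row, col) cells.

# Hypothetical decomposition of an upper case L in a 5-row by 4-column grid:
# the vertical stroke is assumed to be imaged first, then the horizontal stroke.
LETTER_L = [
    {(r, 0) for r in range(5)},      # segment 1: vertical stroke
    {(4, c) for c in range(1, 4)},   # segment 2: horizontal stroke
]


def segments_needed(segments, probes):
    """Return how many segments must be generated before every probe cell
    is covered, or None if some probe never falls on the letter."""
    covered = set()
    for count, segment in enumerate(segments, start=1):
        covered |= segment
        if probes <= covered:
            return count
    return None


early = segments_needed(LETTER_L, {(0, 0), (2, 0)})  # both marks on the first stroke
late = segments_needed(LETTER_L, {(1, 0), (4, 3)})   # one mark on the last stroke
miss = segments_needed(LETTER_L, {(0, 3)})           # mark falls off the letter
print(early, late, miss)
```

In this sketch a "yes" trial whose marks fall on a late segment needs a larger count than one whose marks fall on an early segment, which is the ordering the imagery condition is expected to show.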

The  upshot  of  the  preliminary  work,  then,  was  that  images  of  upper  case 
letters are usually generated a part at a time (which requires descriptive
relations among the parts, according to our theory) and that imagery is used to
decide  if  named  upper  case  letters  have  any  curved  lines.  The  weak  link  here, 
of  course,  is  the  assumption  that  because  letters  are  imaged  a  part  at  a  time 
they  must  be  imaged  on  the  basis  of  a  stored  description.  There  is  a 
computational  argument  in  support  of  this  assumption,  based  on  the  large 
variability  among  instances  of  letters  (as  noted  above),  but  it  is  enough 
simply  to  point  out  that  if  the  experiments  had  not  come  out  as  predicted  it 
may  have  been  because  this  assumption  was  faulty. 

An  Imagery  Deficit 

Thus,  we  began  by  investigating  whether  both  hemispheres  of  patient 
J.W.  could  perform  the  straight/curved  imagery  task.  J.W.  had  his  corpus 
callosum  sectioned  about  3  years  prior  to  our  testing  because  of  severe 
intractable  epilepsy;  he  has  been  extensively  tested  and  his  right  hemisphere 
is  capable  of  comprehending  involved  verbal  instructions  and  of  making  simple 
deductions and classifications (see Sidtis et al., 1981, for further details).

In  order  to  isolate  performance  to  a  single  cerebral  hemisphere,  we 
asked J.W. to stare straight ahead at an asterisk on a screen and flashed lower
case  letters  to  the  left  or  the  right  side  of  this  fixation  point.  As  is 
illustrated  in  Figure  6,  the  construction  of  the  retina  and  optic  nerve  ensures 
that  a  lateralized  stimulus  is  exposed  to  only  one  hemisphere,  given  that  the 
corpus callosum is severed and hence interhemispheric communication is
precluded.  We  asked  J.W.  to  look  at  the  lower  case  letter  and  to  decide  if  the 
upper  case  version  had  any  curved  lines.  He  pressed  one  button  if  he  thought 
the  upper  case  letter  had  curves,  and  another  if  he  thought  it  had  only 
straight  lines.  The  left  arm  was  used  for  all  responding  (due  to  ipsilateral 
efferents, both hemispheres can control the major arm movements; fine motor
movements  are  controlled  only  by  the  contralateral  hemisphere). 


INSERT  FIGURE  6  ABOUT  HERE 

The  task,  then,  requires  use  of  seven  abilities  (each  of  which  is 
presumably  carried  out  by  a  host  of  processing  modules):  First,  the  lower  case 
letter  must  be  ENCODED.  Second,  it  must  ACCESS  the  representation  of  the  upper 
case  version.  Third,  the  image  of  the  upper  case  version  must  be  GENERATED. 
Fourth,  the  image  must  be  RETAINED  long  enough  to  be  judged.  Fifth,  the  image 
must  be  INSPECTED.  Sixth,  a  JUDGMENT  must  be  made.  Finally,  a  RESPONSE  must 
be  produced.  Our  first  goal  was  to  demonstrate  that  the  right  hemisphere  had  a 
deficit  in  performing  the  task,  and  then  to  show  that  it  was  due  to  a  problem 
in  GENERATION  per  se.  Following  this,  we  sought  to  implicate  a  specific 
dissociation  between  the  PUT  processing  module  and  the  other  two  processing 
modules.

The  results  from  the  first  sets  of  trials  were  straightforward:  J.W.’s 
left  hemisphere  made  straight/curved  judgments  virtually  perfectly,  but  his 
right  hemisphere  was  almost  at  chance  performance.  A  number  of  control 
experiments  were  conducted  to  implicate  a  deficit  in  image  generation  per  se. 
First,  we  lateralized  upper  case  letters  and  asked  J.W.  to  perform  the  judgment 
on  the  actual  stimuli.  Both  hemispheres  were  virtually  perfect.  Thus,  both 
hemispheres  could  ENCODE  the  letters,  INSPECT  them,  make  the  JUDGMENT,  and 
produce  correct  RESPONSES.  In  another  control,  we  lateralized  the  lower  case 
versions,  and  simply  asked  J.W.  to  select  the  corresponding  upper  case  version 
from the alphabet, which was displayed in free view. Both hemispheres were
virtually  perfect;  thus  the  right  could  ACCESS  the  cross-case  representation. 
Indeed,  both  hemispheres  could  even  draw  the  upper  case  letters  (using 
contralateral  hands)  after  seeing  the  lower  case  cue,  even  when  the  hand  was 
obscured  from  view  (drawing  under  a  table). 

In  another  control,  we  lateralized  3-letter  words;  the  words  were 
composed  of  upper  case  letters  (e.g.,  MUG).  J.W.  was  then  cued  two  seconds 
later  as  to  which  letter  (first,  second,  or  third)  to  classify  as  being 
straight  or  curved.  Both  hemispheres  could  do  this  task;  in  fact,  the 
hemispheres  performed  as  well  as  when  the  cue  was  given  beforehand,  and  no 
imagery  was  required.  Thus,  the  problem  was  not  that  the  right  hemisphere 
could not RETAIN the image long enough to inspect it, nor was the problem that
it  could  not  INSPECT  images. 

At  this  point  we  had  demonstrated  that  the  right  hemisphere  could 
perform  all  of  the  sub-tasks  except  image  GENERATION.  However,  other  possible 
accounts  needed  to  be  eliminated.  Perhaps  the  right  hemisphere  could  not 
combine  sub-tasks.  Thus,  in  another  control,  we  showed  pairs  of  letters,  one 
upper  case  and  one  lower  case  (both  drawn  at  the  same  size);  on  half  the  trials 
the  upper  case  was  on  the  left  side,  and  on  half  it  was  on  the  right  side.  The 
slides were lateralized, so that only one hemisphere saw the pair. We asked
him to point to the upper case version and to classify it. His right
hemisphere clearly knew the differences between cases, and could do a two-step
task. The word-retention task described in the previous paragraph also
required integrating multiple steps (ENCODING the stimulus, RETAINING an
image, SELECTING the correct letter, INSPECTING the image, JUDGING the shape,
and RESPONDING).

To  investigate  possible  "Stroop"  sorts  of  interference,  where  the  lower 
case  stimulus  somehow  interfered  with  the  upper  case  task,  we  read  him  letters 
aloud  (and  thus  both  hemispheres  knew  which  letter  was  being  queried).  Now  the 
pair "X O" or the pair "O X" was presented to a single hemisphere. The task
now was to point to the place on the screen where the X had been if the upper
case letter had only straight lines and to the location where the O had been if
the upper case letter had any curves. Again, the left hemisphere was virtually
perfect, whereas the right hemisphere was at chance. Thus, the right
hemisphere's poor performance was not due to a conflict between the visible
lower case version and the imaged upper case version.

Both  hemispheres  could  perform  the  judgment  on  visible  stimuli,  could 
perform  it  when  the  image  was  simply  retained  from  external  input,  could  make 
the  association  between  upper  and  lower  case,  and  could  perform  tasks  of 
similar  complexity  involving  selecting  a  case  and  making  the  judgment.  And  the 
right  hemisphere's  difficulty  in  performing  the  imagery  task  did  not  appear  to 
be  in  understanding  the  instructions;  the  other  multistage  tasks  had  comparably 
difficult  instructions,  and  J.W.'s  right  hemisphere  has  been  shown  to 
understand complex instructions in other tasks (see Sidtis et al., 1981). Nor
did  its  problem  lie  in  combining  subtasks,  or  in  having  interference  from  the 
lower  case  stimuli  themselves.  It  appeared  that  J.W.'s  right  hemisphere  simply 
could  not  generate  images  of  the  letters. 
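
The subtraction logic of these controls can be summarized in a short sketch. It is a simplified illustration, not a description of how the data were analyzed: the task labels and the function name candidate_deficits are hypothetical, and the abilities assigned to the cross-case matching task beyond ACCESS are an assumption. Any ability required by a task that a hemisphere performs well is treated as spared; what remains from the failed task is the candidate deficit.

```python
# Subtraction logic for localizing a deficit (illustrative sketch).
# Each task is paired with the abilities it is assumed to require.
REQUIRED = {
    "imagery_judgment": {"ENCODE", "ACCESS", "GENERATE", "RETAIN",
                         "INSPECT", "JUDGE", "RESPOND"},
    "visible_letters":  {"ENCODE", "INSPECT", "JUDGE", "RESPOND"},
    "cross_case_match": {"ENCODE", "ACCESS", "RESPOND"},  # partly assumed
    "word_retention":   {"ENCODE", "RETAIN", "INSPECT", "JUDGE", "RESPOND"},
}

# True = performed well, False = near chance, following the pattern in the text.
RIGHT_HEMISPHERE = {
    "imagery_judgment": False,
    "visible_letters": True,
    "cross_case_match": True,
    "word_retention": True,
}
LEFT_HEMISPHERE = {task: True for task in REQUIRED}


def candidate_deficits(required, outcomes):
    """Abilities needed by failed tasks that no successful task vouches for."""
    spared = set().union(*(required[t] for t, ok in outcomes.items() if ok))
    suspect = set().union(*(required[t] for t, ok in outcomes.items() if not ok))
    return suspect - spared


print(candidate_deficits(REQUIRED, RIGHT_HEMISPHERE))
print(candidate_deficits(REQUIRED, LEFT_HEMISPHERE))
```

Under these assumptions only GENERATE survives the subtraction for the right hemisphere, and nothing survives for the left. The sketch does not cover the remaining alternatives (difficulty combining subtasks, interference from the lower case cue), which were addressed by separate controls.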

A  Selective  Imagery  Deficit 

The  results  described  so  far  serve  only  to  demonstrate  an  image 
generation  deficit  in  J.W.'s  right  hemisphere.  They  do  not  implicate  a  deficit 
in one processing module or another. In order to implicate a deficit in using
the PUT processing module per se, we needed to show that both hemispheres could
use the PICTURE and FIND processing modules. These are the only processes
required to generate images of the general shape of an object, if such a shape
was encoded as a single low-spatial-frequency perceptual unit. And in fact,
we  discovered  that  both  of  J.W.'s  hemispheres  could  perform  tasks  requiring  the 
imaging  of  overall,  general  shapes:  In  one,  we  asked  each  of  J.W.'s 
hemispheres  to  decide  which  of  two  similar-sized  objects  (e.g.,  goat  vs.  hog) 
was  the  larger,  a  task  previously  shown  to  require  imagery  (see  chapter  9, 
Kosslyn,  1980).  J.W.  stared  at  a  central  fixation  point,  and  a  word  was 
presented  to  one  side  or  the  other.  If  a  goat  was  larger  than  the  animal  named 
by  the  word,  he  pushed  one  button;  if  the  word  named  an  animal  larger  than  a 
goat,  he  pushed  the  other  button.  Both  hemispheres  performed  this  task 
virtually  perfectly,  with  no  difference  in  either  the  error  rates  or  response 
times.  He  also  could  decide  equally  well  in  both  hemispheres  whether  a  named 
object  was  higher  than  it  is  wide.  These  tasks  require  the  PICTURE  processing 
module to generate images of the general shapes of the objects, and the FIND
processing  module  to  inspect  the  objects  in  the  image.  Because  no  parts  need 
to  be  added  to  the  general  shape  to  perform  either  task,  the  PUT  module  is  not 
necessary.  (Incidentally,  these  results  are  also  important  because  they  show 
that  the  right  hemisphere's  problem  is  not  simply  in  processing  letters,  which 
are  linguistic  materials.) 

We also conducted another task that should require the PUT module.
This task made use of exactly the same materials used in the size-judgment
task, which both hemispheres could perform equally well. Now J.W. was asked to
decide if the named animals did or did not have ears that protrude above the
top of the skull (e.g., an ape and a sheep do not, a cat and a mouse do). One
response button was labeled with an inverted U (representing the top of the
animal's head) with a triangle sticking above it; the other had the inverted U
with a small u hanging down. The names of the animals were presented to the
individual hemispheres, and they categorized their ears (J.W. lived on a farm
and  is  quite  familiar  with  animals).  The  left  hemisphere  made  very  few  errors, 
whereas  the  right  hemisphere  performed  at  chance. 

These  results,  then,  showed  that  J.W.'s  right  hemisphere  could  perform 
imagery  tasks  that  require  use  only  of  the  PICTURE  and  FIND  modules,  but  could 
not  perform  tasks  that  also  require  use  of  the  PUT  module. 

V.  NEUROPSYCHOLOGICAL  EVIDENCE  FOR  THE  TRANSFORMATION  MODULES 

The  theory  leads  us  to  expect  that  the  right  hemisphere  will  be  better 
at  actually  transforming  the  representation  of  relative  location.  There  is 
some  suggestive  evidence  that  this  may  be  true.  For  example,  Ratcliff  (1979) 
found  that  subjects  with  right  parietal  lobe  damage  have  difficulty  performing 
a simple mental rotation task. Similarly, Weisenburg and McBride (1935) found
that  such  patients  have  difficulty  in  deciding  whether  two  shaded  sides  of  an 
unfolded  cube  would  be  adjacent  when  the  sides  were  folded  to  form  the  cube, 
and  Le  Doux,  Wilson,  and  Gazzaniga  (1977)  found  that  the  isolated  right 
hemisphere of a split-brain patient was better at spatio-manipulation tasks.
However,  other  studies  have  provided  mixed  evidence  for  a  simple 
right-hemisphere  specialization  for  image  transformations  (e.g.,  see  Butters, 
Barton,  and  Brody,  1970;  De  Renzi  and  Faglioni,  1967;  see  chapter  6  of  De 
Renzi,  1982  for  a  review).  Similarly,  studies  of  normal  subjects  receiving 
lateralized stimuli have produced mixed results (see Cohen, 1982; Simion,
Bagnara,  Bisiacchi,  Roncato,  and  Umilta,  1980).  Overall,  the  findings  can  most 
easily  be  interpreted  as  indicating  bilateral  involvement,  which  is  not 
surprising  if  the  present  theory  is  correct. 

The  theory  also  leads  us  to  expect  that  a  left-hemisphere  module  will 
be  used  when  complex  shapes  are  transformed  in  an  image.  Kosslyn,  Berndt,  and 
Doyle  (in  press)  report  results  that  support  this  claim.  Two  aphasic  patients 
were  tested,  both  of  whom  had  severe  left-hemisphere  damage;  one  patient 
corresponded  quite  closely  to  the  classic  syndrome  of  Broca's  aphasia  and  one 
corresponded  quite  closely  to  the  syndrome  of  Wernicke's  aphasia.  These 
patients  were  asked  to  image  two-dimensional  analogues  of  the  Shepard-Metzler 
stimuli  illustrated  in  Figure  1.  These  shapes  were  formed  by  selecting  five 
cells  in  a  4  x  5  grid  that  were  each  connected  to  at  least  one  other  cell,  and 
eliminating  all  but  these  cells  (producing  a  set  of  connected  boxes).  The 
subjects  were  shown  a  pair  of  these  forms,  and  asked  whether  they  were 
identical  irrespective  of  orientation  about  the  circle;  the  left  form  was 
always  vertical  and  the  right  was  at  a  variety  of  orientations.  On  half  of  the 
trials  the  two  forms  were  identical,  and  on  half  they  were  mirror-reversals  of 
one another. The results were clear-cut: both patients showed decrements for
mental  rotation.  Indeed,  the  rate  of  rotation  was  almost  an  order  of  magnitude 
slower  than  that  of  a  control  group.  This  finding  suggests  that  the  left 
hemisphere  plays  some  role  in  the  rotation  of  complex  forms. 
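
The stimulus construction and the identical/mirror-reversed decision can be sketched as follows. This is a simplified illustration: the forms are grown so that all five cells make up one connected piece (a stronger constraint than the one stated above), only 90-degree rotations are considered (the study used a variety of orientations), and all function names are hypothetical.

```python
import random

ROWS, COLS = 4, 5  # grid size given in the text


def neighbors(cell):
    """Cells sharing an edge with `cell` that lie inside the grid."""
    r, c = cell
    steps = ((1, 0), (-1, 0), (0, 1), (0, -1))
    return [(r + dr, c + dc) for dr, dc in steps
            if 0 <= r + dr < ROWS and 0 <= c + dc < COLS]


def grow_form(rng, n_cells=5):
    """Grow a connected form by repeatedly adding a random adjacent cell."""
    form = {(rng.randrange(ROWS), rng.randrange(COLS))}
    while len(form) < n_cells:
        frontier = sorted({n for cell in form for n in neighbors(cell)} - form)
        form.add(rng.choice(frontier))
    return frozenset(form)


def normalize(form):
    """Shift a form so that its bounding box starts at (0, 0)."""
    min_r = min(r for r, _ in form)
    min_c = min(c for _, c in form)
    return frozenset((r - min_r, c - min_c) for r, c in form)


def rotate90(form):
    """Rotate a form by a quarter turn in the plane."""
    return normalize(frozenset((c, -r) for r, c in form))


def mirror(form):
    """Reflect a form about a vertical axis."""
    return normalize(frozenset((r, -c) for r, c in form))


def same_up_to_rotation(a, b):
    """True if b matches a at 0, 90, 180, or 270 degrees of rotation."""
    target = normalize(a)
    for _ in range(4):
        if normalize(b) == target:
            return True
        b = rotate90(b)
    return False


form = grow_form(random.Random(0))
l_form = frozenset({(0, 0), (1, 0), (2, 0), (3, 0), (3, 1)})
identical_trial = same_up_to_rotation(l_form, rotate90(rotate90(l_form)))
mirror_trial = same_up_to_rotation(l_form, mirror(l_form))
print(sorted(form), identical_trial, mirror_trial)
```

The L-shaped form is used for the example because it cannot be rotated in the plane into its mirror image, so the two trial types give different answers.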

The  critical  test  of  the  theory  of  image  transformation  will  be  to 
compare  performance  on  multipart  stimuli  versus  single  part  stimuli  (e.g.,  of 
the  sort  used  by  Marmor  and  Zaback,  1976).  If  the  theory  is  correct,  the 
CLEANUP module is required only with multipart stimuli. If so, then the
putative  right-hemisphere  superiority  with  the  MOVE  module  ought  to  be  apparent 
when  one-part  stimuli  are  used. 

VI.  NEUROPSYCHOLOGICAL  EVIDENCE  FOR  SPECIALIZED 
REPRESENTATIONS  OF  RELATIONS 

A  major  claim  of  the  theory  is  that  the  "dorsal"  system  becomes 
differentiated  in  the  two  hemispheres.  The  left  dorsal  system  putatively 
becomes  more  effective  at  assigning  a  spatial  relation  to  a  category,  such  as 
"left  of"  or  "attached  to".  The  right  dorsal  system  putatively  becomes  more 
effective  at  representing  position  as  a  point  in  space  relative  to  a  single 
origin.  Kosslyn  and  Barrett  (in  preparation)  set  out  to  test  this  claim 
directly  in  two  simple  experiments:  First,  subjects  (normal  college  students) 
were shown stimuli like those illustrated in Figure 7. These stimuli were line
drawings  of  blobs,  with  a  dot  being  either  on  the  line  or  outside  of  it.  The 
subjects  were  asked  to  fixate  directly  ahead,  and  a  stimulus  was  presented  to 
one  side  or  the  other.  For  one  set  of  trials,  the  subjects  were  to  respond 
"true"  if  the  dot  was  on  the  line,  and  "false"  if  it  was  off  the  line.  For 
another  set  of  trials,  the  subjects  were  to  respond  "true"  if  the  dot  was 
within  2  mm  of  the  line,  and  "false"  if  it  was  farther  than  2  mm  from  it 
(subjects  were  shown  what  a  2  mm  distance  looked  like  at  the  beginning  of  the 
experiment). Our prediction was that the left hemisphere would be more
effective  at  categorizing  the  dot/line  relation  as  "on"  or  "off,"  whereas  the 
right  hemisphere  would  be  more  effective  at  representing  the  metric  spatial 
relation.  As  is  evident  in  Figure  8,  these  predictions  were  borne  out:  The 
on/off  judgment  was  easier  when  the  stimuli  were  presented  to  the  left 
hemisphere, whereas the near/far judgment was easier when the stimuli were
presented to the right hemisphere. The near/far task was also easier in
general,  which  reflects  the  particular  stimuli  we  used  (the  distances  were  such 
that  the  discrimination  was  relatively  easy). 

A  second  experiment  was  conducted  to  provide  convergent  evidence  for 
the  claim.  Now,  subjects  saw  stimuli  consisting  of  a  plus  and  a  minus  sign, 
placed  side  by  side.  On  half  the  trials  the  plus  was  on  the  right,  and  on  half 
the trials the plus was on the left; in addition, on half of each of these
types  of  trials  the  stimuli  were  less  than  1  inch  apart,  whereas  on  the  other 
half  they  were  greater  than  1  inch  apart.  Subjects  again  began  each  trial  by 
fixating straight ahead, and a pair of stimuli was lateralized. Subjects again
participated  in  two  sets  of  trials.  In  one,  they  simply  decided  whether  or  not 
the  plus  was  to  the  right  of  the  minus.  We  expected  the  left  hemisphere  to  be 
better  at  this  sort  of  categorical  judgment.  In  the  other  set  of  trials,  the 
subjects  decided  whether  the  stimuli  were  closer  or  farther  than  an  inch  apart. 
We  expected  the  right  hemisphere  to  be  better  at  this  sort  of  metric  judgment. 
These  expectations  were  confirmed,  as  is  illustrated  in  Figure  9. 
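
The design of this second experiment can be summarized in a brief sketch, which shows that the same four stimulus types support both a categorical and a metric judgment. The separations of 0.5 and 1.5 inches and all names are hypothetical; the text specifies only that the stimuli were less than or greater than 1 inch apart.

```python
from itertools import product

THRESHOLD_IN = 1.0  # criterion distance in inches, from the text


def make_stimuli(near=0.5, far=1.5):
    """Cross plus-side (right/left) with separation (near/far).
    The near and far values are illustrative assumptions."""
    stimuli = []
    for plus_on_right, separation in product((True, False), (near, far)):
        half = separation / 2
        plus_x, minus_x = (half, -half) if plus_on_right else (-half, half)
        stimuli.append({"plus_x": plus_x, "minus_x": minus_x})
    return stimuli


def categorical_judgment(stimulus):
    """Is the plus to the right of the minus? Distance is irrelevant."""
    return stimulus["plus_x"] > stimulus["minus_x"]


def metric_judgment(stimulus):
    """Are the two signs closer than the criterion? Order is irrelevant."""
    return abs(stimulus["plus_x"] - stimulus["minus_x"]) < THRESHOLD_IN


stimuli = make_stimuli()
answers = [(categorical_judgment(s), metric_judgment(s)) for s in stimuli]
print(answers)
```

Each judgment is true on exactly half of the stimulus types, and the two answers are uncorrelated across the set, so a hemispheric advantage on one judgment cannot be attributed to the stimuli themselves.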


INSERT  FIGURES  7,  8,  9  ABOUT  HERE 

The  present  theory  is  also  consistent  with  numerous  findings  already  in 
the  literature.  For  one,  it  is  typically  reported  that  different  types  of 
deficits  in  drawing  occur  after  damage  to  the  left  or  right  hemisphere  (see 
De  Renzi,  1982).  When  the  right  hemisphere  is  damaged,  the  drawings  often 
contain  correct  details,  which  are  locally  organized  correctly,  but  the  global 
organization is awry. The effects of right hemisphere damage would be to
disrupt  the  coordinate,  single-origin,  representation.  Thus,  the  overall 
coherence  of  form  will  be  disrupted;  when  the  left  hemisphere  is  operating 
alone,  it  will  have  the  local  pairwise  relations,  but  not  the  overall  form.  In 
contrast,  when  the  left  hemisphere  is  damaged,  the  drawings  preserve  the  global 
organization  but  lack  detail.  Presumably,  without  the  PUT  module,  parts  of 
nonrigid  objects  will  not  be  able  to  be  placed  in  their  correct  relative 
locations.  For  rigid  objects  (i.e.,  those  which  have  a  single  version  in  which 
parts  are  rigidly  arranged),  on  the  other  hand,  we  have  no  reason  to  expect  a 
right  hemisphere  deficit. 

The  present  theory,  then,  predicts  that  the  hemispheres  should  deal 
with  rigid  and  nonrigid  objects  in  different  ways.  A  face  of  an  individual 
person  is  an  example  of  an  object  that  is  essentially  rigid.  That  is,  the 
locations of the eyes, eyebrows, nose, ears, hairline, and mouth do not vary.
Facial expressions change the shape of parts, but do not alter their positions.
In addition, the metric relations among the parts are not something to be
ignored in recognition, as is the case with the various configurations of the
parts of nonrigid objects (e.g., the configuration of the limbs of a person).
Thus, it is interesting that the right hemisphere seems to have a special role
in  face  recognition.  It  is  well-documented  that  the  right  hemisphere  is  better 
able to recognize faces than the left (e.g., see Gazzaniga and Smylie, 1983).
In most cases, the relations among parts must be preserved if a face is to be
recognized,  so  these  data  are  prima  facie  evidence  that  the  right  hemisphere  is 
in  fact  preserving  the  relations  among  parts  of  faces  (although  I  claim  that 
the  relations  are  relative  to  a  single  origin,  not  pairwise).  Note,  however, 
that some faces can be recognized on the basis of distinctive characteristics
(e.g., Nixon's jowls and nose), and so the left hemisphere should be able to
recognize faces using this sort of information (see Diamond and Carey, 1977;
Etcoff, in press). To my knowledge, no one has systematically investigated how
left-hemisphere damaged patients draw faces. This would be difficult to do
because left hemisphere patients will often suffer from aphasia (making it
difficult to convey the instructions) and right-side hemiplegia (making it
difficult, or impossible, to use the right hand). In addition, drawing is a
problematic dependent measure even with right-hemisphere damaged patients, given
that  the  damage  can  cause  disruption  to  output  modules  in  the  right  hemisphere, 
which  would  mask  the  intact  functioning  of  the  representational  system  feeding 
into  these  modules. 

In  the  same  vein,  we  can  also  account  for  the  findings  of  Martin  (1979) 
and  Sergent  (1982)  on  preferred  level  of  analysis  of  the  two  hemispheres.  This 
research  was  derived  from  that  of  Navon  (1977),  who  showed  subjects  large 
letters  that  were  constructed  by  arranging  numerous  copies  of  a  small  letter. 
Subjects  were  asked  to  look  for  target  letters,  which  could  be  presented  either 
as  a  large,  composite  figure,  or  as  smaller,  constituent  figures.  Navon  found 
a "global precedence" effect, with subjects detecting large targets faster than
small ones. It is interesting that when stimuli of this type were lateralized
in  the  experiments  of  Martin  (1979)  and  Sergent  (1982),  the  left  hemisphere  was 
faster  when  the  target  was  the  small  letter,  whereas  the  right  hemisphere  was 
faster  when  the  target  was  the  large  letter. 

Our  account  of  these  results  again  rests  on  our  analysis  of  the 
purposes  of  the  left  and  right  hemisphere  spatial  representational  systems.  I 
have argued that the coordinate representation in the right hemisphere is used
primarily in the service of navigation. If so, then information about small
spatial variations (details) may be used less often than information about more
coarse variations. In this case, the modules that process coarser information
will come to generate coordinate representations more quickly (because of the
principle of exercise) in the right hemisphere. These representations may then
be template-matched against representations of rigid patterns stored in the
right hemisphere. This operation presumably can be performed very quickly,
more quickly than generating descriptions of the interpart relations and
comparing them to stored information in the left hemisphere. On the other
hand, the left hemisphere system purportedly categorizes parts and relations
among them. Hence, the modules that represent information about parts
presumably come to be more effective in the left hemisphere.

The present account of the hemispheric specialization results is
supported by the finding that the "global precedence" effect (found when
stimuli are not lateralized) can be eliminated if the large pattern is
distorted (Hoffman, 1980). If the present theory is correct, this manipulation
would impair the matching against rigid patterns stored in the right
hemisphere. When the input pattern is distorted, it can no longer be easily
matched against such a rigid template, and the left hemisphere system becomes
relatively more effective.

VII.  CONCLUSIONS 

This  paper  has  attempted  to  illustrate  the  value  of  using  a 
computational  approach  in  conjunction  with  neurological  data  and  methodologies 
from  cognitive  psychology.  The  results  of  taking  seriously  the  neurophysiology 
and neuroanatomy have been somewhat surprising. On the face of things, the
notion  of  splitting  apart  processing  of  parts  from  processing  of  relations 
seems  counterintuitive  to  many.  In  addition,  it  was  something  of  a  surprise  to 
discover  that  imagery  should  be  selectively  better  for  some  tasks  in  the  left 
hemisphere.  The  standard  view  that  imagery  is  a  right  hemisphere  function  is 
incorrect. 

But is the present approach really an advance over a more commonsense,
intuitive approach? Yes, it is, for two reasons. First, the usual accounts
are  always  vague.  For  example,  what  does  it  mean  to  say  that  the  right 
hemisphere  is  "perceptual"?  Many  researchers  in  artificial  intelligence  (e.g., 
see  Winston,  1975)  claim  that  perceptual  representations  are  not  different  in 
kind  from  language  representations.  Or  what  does  it  mean  to  say  that  the  right 
hemisphere is "intuitive"? Or "synthetic"? And so on. The computational
approach introduces precision, which allows theories to generate clear
predictions. Second, the present approach is an advance over a more
commonsense, intuitive approach because such intuitive theories are often incorrect. To the
extent  that  they  make  concrete  claims,  they  are  usually  overly  general  and 
coarse.  To  cite  the  present  example,  imagery  is  not  a  left  or  a  right 
hemisphere  function.  "Imagery"  is  too  coarse  a  level  of  analysis;  the  function 
decomposes  into  numerous  sub-abilities,  which  in  turn  are  carried  out  by 
numerous  processing  modules.  To  further  complicate  matters,  some  of  these 
modules  may  be  dedicated  to  a  particular  ability  (e.g.,  the  MOVE  module), 
whereas other modules may be used in carrying out a number of abilities (e.g.,
the  FIND  module,  which  purportedly  is  recruited  in  image  inspection, 
generation,  and  transformation).  And  the  individual  processing  modules  need 
not be lateralized in the same way. The computational approach is ideally
suited  for  decomposing  abilities  into  components,  which  gives  one  enormous 
conceptual  power  in  analyzing  the  functional  differences  between  the 
hemispheres.

Perhaps  the  greatest  promise  of  the  present  approach  over  previous  ones 
lies  in  its  potential  for  explaining  variation.  That  is,  not  everyone  is 
lateralized  in  the  same  way.  For  every  generalization  about  localization  of 
function,  an  exception  can  usually  be  found.  Indeed,  it  is  often  difficult  to 
make  generalizations  at  all;  for  example,  typical  right  parietal  syndromes  do 
not  always  occur  following  right  parietal  damage,  but  may  occur  following 
damage  to  other  structures  (see  DeRenzi,  1982).  This  sort  of  individual 
difference  is  to  be  expected  if  the  present  theory  is  correct,  because  the 
mechanism  of  differentiation  is  sensitive  to  the  parameter  values  used.  For 
example, the precise "transmission time" over the corpus callosum is important
for ensuring that modules on the same side as the "target" module are selected
over  other-sided  ones.  If  transmission  is  relatively  fast,  same-sided  modules 
will  not  be  selected  as  often  over  input  from  other-sided  modules,  and  hence 
will  not  become  as  differentially  exercised.  Interestingly,  Witelson  (1985) 
found large variations in the number of fibers in the corpora callosa of
different people, with more strongly right-handed people having smaller
callosa; this result makes perfect sense if a) more strongly right-handed
people are more strongly lateralized, and b) fewer fibers result in slower
transmission times, and hence greater probabilities that modules in the same
hemisphere as "target" modules will become more exercised. In addition, women
have shorter, thicker corpora callosa than men, which may imply faster
transmission times and hence less pronounced lateralization. The usefulness of
these sorts of speculations cannot be assessed without a fully functioning
computer simulation model, which is currently being developed in our
laboratory.
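
The dependence on callosal transmission time described above can be
illustrated with a toy sketch. It is not the simulation model under
development; every name and parameter value in it is an assumption chosen
only for illustration. Two initially equal modules compete to supply input to
a "target" module, one in the same hemisphere and one in the other
hemisphere. On each trial the module whose output arrives first is selected
and, by the principle of exercise, becomes slightly faster. Output from the
other-sided module incurs an additional callosal delay.

```python
import random


def simulate_differentiation(callosal_delay, trials=2000, gain=0.001,
                             noise=2.0, seed=0):
    """Return the fraction of trials on which the same-sided module is selected.

    All parameters are hypothetical: latencies are in arbitrary time units,
    `gain` is the proportional speed-up a module receives each time it is
    selected, and `noise` is the trial-to-trial variability in latency.
    """
    rng = random.Random(seed)
    # Both candidate modules start out equally fast.
    latency = {"same": 10.0, "other": 10.0}
    wins = {"same": 0, "other": 0}
    for _ in range(trials):
        arrival_same = latency["same"] + rng.gauss(0.0, noise)
        # Other-sided output must also cross the corpus callosum.
        arrival_other = latency["other"] + callosal_delay + rng.gauss(0.0, noise)
        winner = "same" if arrival_same <= arrival_other else "other"
        wins[winner] += 1
        # Principle of exercise: the selected module becomes slightly faster.
        latency[winner] *= 1.0 - gain
    return wins["same"] / trials


if __name__ == "__main__":
    for delay in (0.5, 2.0, 5.0):
        rate = simulate_differentiation(delay)
        print(f"callosal delay {delay}: same-sided selection rate {rate:.2f}")
```

Under these assumptions, a longer callosal delay should give the same-sided
module a larger share of selections, and hence more differential exercise,
than a shorter delay. This is the qualitative pattern the account requires.
The sketch says nothing about whether the parameter values are realistic.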

A  virtue  of  a  theory  of  processing  modules  is  that  the  theory  itself 
can  be  modular.  That  is,  we  can  add  to  the  theory  by  expanding  the  number  of 
modules  examined,  without  affecting  the  portions  of  the  theory  previously 
developed.  Indeed,  this  ability  to  simply  add  without  having  to  modify  the 
previously  posited  modules  is  one  sign  that  the  earlier  theory  is  correct,  that 
the  actual  modules  have  been  characterized.  This  modular  property  is 
fortunate,  given  the  ultimate  goals  of  the  present  theory.  The  theory 
ultimately  should  allow  us  to  account  for  all  of  the  major  findings  on  visual 
hemisphericity  (e.g.,  see  Kinsbourne  and  Hiscock,  1983;  Springer  and  Deutsch, 
1981).  In  particular,  we  need  a  precise  account  of  results  like  those  of 
Bisiach and Luzzatti (1978) and Bisiach, Luzzatti, and Perani (1979), who found
"unilateral  visual  neglect"  in  imagery.  That  is,  they  found  that  damage  to  the 
right  parietal  lobe  can  result  in  a  patient's  ignoring  the  left  half  of  not 
only  what  is  seen,  but  what  is  imaged.  This  result  presumably  has  something  to 
do  with  processing  of  the  coordinate  spatial  relation  we  have  posited,  but  the 
precise  account  is  not  yet  clear.  In  addition,  Newcombe  and  Russell  (1969) 
found  selective  deficits  in  different  spatial  abilities  following  right 
parietal  damage,  and  the  deficits  depended  in  part  on  the  precise  location  of 
the  damage  within  the  parietal  lobe  itself.  The  nature  of  this  selective 
breakdown  must  be  specified  by  the  theory. 

A  new  way  of  testing  a  theory  like  the  present  one  will  be  to  construct 
a simulation model, and then to "lesion" it. The effects of disrupting the
model in selected ways will constitute precise predictions of behavioral
deficits following brain damage. It will also be interesting to do the
reverse: to start with a known deficit, and see what sorts of "lesions" are
necessary to make the simulation mimic the deficit. If this procedure is
successful, it may be that we are on the road to developing a new, more precise
diagnostic tool.

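The two directions of "lesioning" described above can be sketched in
miniature. This is a hypothetical illustration, not the simulation model
itself. The use of FIND in inspection, generation, and transformation follows
the text; the assignment of MOVE to transformation and the GENERATE
placeholder module are assumptions made only so that the example is complete.

```python
from itertools import combinations

# Hypothetical table of which modules each imagery ability recruits.
# FIND's three uses follow the text; MOVE -> transformation and the
# GENERATE placeholder are illustrative assumptions.
TASKS = {
    "inspection": {"FIND"},
    "generation": {"FIND", "GENERATE"},
    "transformation": {"FIND", "MOVE"},
}


def impaired_tasks(lesioned):
    """Forward direction: predict which abilities fail after a lesion."""
    lesioned = set(lesioned)
    return {task for task, needed in TASKS.items() if needed & lesioned}


def lesions_matching(deficit):
    """Reverse direction: find every lesion whose predictions equal a deficit."""
    modules = sorted(set().union(*TASKS.values()))
    matches = []
    for size in range(len(modules) + 1):
        for combo in combinations(modules, size):
            if impaired_tasks(combo) == set(deficit):
                matches.append(set(combo))
    return matches


if __name__ == "__main__":
    print(impaired_tasks({"MOVE"}))
    print(lesions_matching({"transformation"}))
    print(lesions_matching({"inspection", "generation", "transformation"}))
```

In this toy table, a lesion to a dedicated module impairs a single ability,
whereas a lesion to a shared module impairs several. The reverse search can
return more than one candidate lesion for the same deficit, so a diagnostic
use of the procedure would need further behavioral tests to choose among them.
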
1. The term "image" is ambiguous, referring both to the experience itself and the
internal  representation  that  gives  rise  to  the  percept-like  experience.  This 
representation  is  taken  to  correspond  to  a  particular  brain  state.  In  the 
present  theory,  the  term  "image"  refers  to  the  internal  representation,  not  the 
experience itself. We assume that the conscious experience accompanies the
"image" brain state (for some unspecified reason), and thus the experience of
"having a mental image" can be taken as a hallmark that the underlying
"imagery"  brain  state  is  present. 
Figure Captions

Figure  1.  Examples  of  the  stimuli  used  by  Shepard  and  Metzler  (1971). 

Figure 2. Time to rotate images in the picture plane and in depth in the
Shepard  and  Metzler  (1971)  experiment. 

Figure  3.  The  map  that  was  imaged  and  later  scanned  in  the  experiment  by 
Kosslyn,  Ball  and  Reiser  (1978). 

Figure 4. Time to scan between pairs of locations on the imaged map in the
Kosslyn,  Ball  and  Reiser  (1978)  experiment. 

Figure  5.  The  ventral  and  dorsal  visual  systems. 

Figure  6.  Top  view  of  a  brain,  illustrating  how  input  in  the  left  and  right 
visual  fields  is  processed. 

Figure  7.  Examples  of  stimuli  used  in  the  Kosslyn  and  Barrett  study. 

Figure 8. Results of the Kosslyn and Barrett experiment in which subjects
judged whether a dot was on or off a blob or whether a dot was near or far
from the blob.

Figure 9. Results from the Kosslyn and Barrett experiment in which subjects
judged whether an X was to the left or right of an O or whether the X was
within an inch of an O.